Software, systems and methods for managing a distributed network

ABSTRACT

A system and method for managing network bandwidth consumption. The system may include an agent module loadable on a networked computer and configured to aid in managing bandwidth consumption within a network. The agent module is configured to obtain an allocation of network bandwidth usable by the networked computer, and is further configured to sub-allocate such allocation among multiple bandwidth-consuming components associated with the networked computer. The system may further include multiple such agent modules loadable on plural networked computers, and a control module configured to interact with each of the agent modules to dynamically manage bandwidth usage by the networked computers.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of copending U.S. patent application Ser. No. 10/369,259, filed Feb. 19, 2003, which is a continuation-in-part of U.S. patent application Ser. No. 09/532,101, filed Mar. 21, 2000, which is based upon and claims the benefit under 35 U.S.C. § 119 of U.S. provisional patent application Ser. No. 60/357,731, filed Feb. 18, 2002. All aforementioned applications to which the present application claims priority are hereby incorporated by reference.

TECHNICAL FIELD

The present invention relates generally to distributed network systems, and more particularly to software, systems and methods for managing resources of a distributed network.

BACKGROUND

Both public and private networks have shifted toward a predominantly distributed computing model, and have grown steadily in size, power and complexity. This growth has been accompanied by a corresponding increase in demands placed on information technology to increase enterprise-level productivity, operations and customer/user support. To achieve interoperability in increasingly complex network systems, TCP/IP and other standardized communication protocols have been aggressively deployed. Although many of these protocols have been effective at achieving interoperability, their widespread deployment has not been accompanied by a correspondingly aggressive development of management solutions for networks using these protocols.

Indeed, conventional computer networks provide little in the way of solutions for managing network resources, and instead typically provide what is known as “best efforts” service to all network traffic. Best efforts is the default behavior of TCP/IP networks, in which network nodes simply drop packets indiscriminately when faced with excessive network congestion. With best efforts service, no mechanism is provided to avoid the congestion that leads to dropped packets, and network traffic is not categorized to ensure reliable delivery of more important data. Also, users are not provided with information about network conditions or underperforming resources. This lack of management frequently results in repeated, unsuccessful network requests, user frustration and diminished productivity.

Problems associated with managing network resources are intensified by the dramatic increase in the demand for these resources. New applications for use in distributed networking environments are being developed at a rapid pace. These applications have widely varying performance requirements. Multimedia applications, for example, have a very high sensitivity to jitter, loss and delay. By contrast, other types of applications can tolerate significant lapses in network performance. Many applications, particularly continuous media applications, have very high bandwidth requirements, while others have bandwidth requirements that are comparatively modest. A further problem is that many bandwidth-intensive applications are used for recreation or other low priority tasks.

In the absence of effective management tools, the result of this increased and varied competition for network resources is congestion, application unpredictability, user frustration and loss of productivity. When networks are unable to distinguish unimportant tasks or requests from those that are mission critical, network resources are often used in ways that are inconsistent with business objectives. Bandwidth may be wasted or consumed by low priority tasks. Customers may experience unsatisfactory network performance as a result of internal users placing a high load on the network.

Various solutions have been employed, with limited success, to address these network management problems. For example, to alleviate congestion, network managers often add more bandwidth to congested links. This solution is expensive and can be temporary: network usage tends to shift and grow such that the provisioned link soon becomes congested again. This often happens where the underlying cause of the congestion is not addressed. Usually, it is desirable to intelligently manage existing resources, as opposed to “over-provisioning,” i.e. simply providing more resources to reduce scarcity.

A broad, conceptual class of management solutions may be thought of as attempts to increase “awareness” in a distributed networking environment. The concept is that where the network is more aware of applications or other tasks running on networked devices, and vice versa, then steps can be taken to make more efficient use of network resources. For example, if network management software becomes aware that a particular user is running a low priority application, then the software could block or limit that user's access to network resources. If management software becomes aware that the network population at a given instance includes a high percentage of outside customers, bandwidth preferences and priorities could be modified to ensure that the customers had a positive experience with the network. In the abstract, increasing application and network awareness is a desirable goal; however, application vendors largely ignore these considerations and tend to focus not on network infrastructure, but rather on enhancing application functionality.

Quality of service (“QoS”) and policy-based management techniques represent efforts to bridge the gap between networks, applications and users in order to more efficiently manage the use of network resources. QoS is a term referring to techniques which allow network-aware applications to request and receive a predictable level of service in terms of performance specifications such as bandwidth, jitter, delay and loss. Known QoS methods include disallowing certain types of packets, slowing transmission rates, establishing distinct classes of services for certain types of packets, marking packets with a priority value, and various queuing methods. In a distributed environment having scarce resources, QoS techniques necessarily introduce unfairness into the system by giving preferential treatment to certain network traffic.

Policy-based network management uses policies, or rules, to define how network resources are to be used. In a broad sense, a policy includes a condition and an action. An example of a policy could be to block access or disallow packets (action) if the IP source address of the data is included on a list of disallowed addresses (condition). One use of policy-based network management techniques is to determine when and how the unfairness introduced by QoS methods should apply.
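By way of illustration only, the condition/action structure of a policy may be sketched as follows. The sketch assumes a packet represented as a simple dictionary with a `src_ip` key; the names `Policy`, `evaluate` and `DISALLOWED_SOURCES`, and the addresses shown, are hypothetical and do not come from any described embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical packet representation: a plain dict with a "src_ip" key.
Packet = Dict[str, str]


@dataclass
class Policy:
    name: str
    condition: Callable[[Packet], bool]  # when the policy applies
    action: str                          # what to do when it applies


# Illustrative list of disallowed source addresses (documentation range).
DISALLOWED_SOURCES = {"192.0.2.10", "192.0.2.11"}

block_listed_sources = Policy(
    name="block-disallowed-sources",
    condition=lambda pkt: pkt["src_ip"] in DISALLOWED_SOURCES,
    action="block",
)


def evaluate(policies: List[Policy], pkt: Packet, default: str = "allow") -> str:
    """Return the action of the first policy whose condition holds."""
    for policy in policies:
        if policy.condition(pkt):
            return policy.action
    return default
```

In this sketch, a packet whose source address is on the list yields the action "block", and any other packet falls through to the default action.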

Policy-based management solutions typically require that network traffic be classified before it is acted upon. The classification process can occur at various levels of data abstraction, and may be described in terms of layered communication protocols that network devices use to communicate across a network link. There are two protocol layering models which dominate the field. The first is the OSI reference model, depicted in FIG. 1. The layers of the OSI model are: application (layer 7), presentation (layer 6), session (layer 5), transport (layer 4), network (layer 3), data link (layer 2) and physical (layer 1). The second major model forms the basis for the TCP/IP protocol suite. Its layers are application, transport, network, data link and hardware, as also depicted in FIG. 1. The TCP/IP layers correspond in function to the OSI layers, but without a presentation or session layer. In both models, data is processed and changes form as it is sequentially passed between the layers.

Known policy based management solutions and QoS methods typically classify data by monitoring data flows at the transport layer and below. For example, a common multi-parameter classifier is the well known “five-tuple” consisting of (IP source address, IP destination address, IP protocol, TCP/UDP source port and TCP/UDP destination port). These parameters are all obtained at the transport and network layers of the models. The large majority of existing policy-based, QoS solutions are implemented by monitoring and classifying network activity at these protocol layers. However, the higher the protocol layer, the more definitive and specific the available data and classifiers. Because conventional policy-based, QoS systems do not employ classifiers at higher than the transport layer, they cannot employ policy-based techniques or QoS methods using the richer and more detailed data available at the higher layers. The conventional systems are thus limited in their ability to make the network more application-aware and vice versa.
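A minimal sketch of five-tuple classification, offered only to make the limitation concrete, is given below. The `FiveTuple` type, the `PORT_CLASSES` table and the `classify` function are illustrative assumptions; the port-to-class mapping is an example and not a description of any particular conventional product.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: str   # e.g. "TCP" or "UDP"
    src_port: int
    dst_port: int


# Hypothetical mapping from (protocol, destination port) to a traffic class.
PORT_CLASSES = {
    ("TCP", 80): "web",
    ("TCP", 25): "mail",
    ("UDP", 53): "dns",
}


def classify(flow: FiveTuple) -> str:
    """Classify a flow using only network- and transport-layer fields."""
    return PORT_CLASSES.get((flow.protocol, flow.dst_port), "unclassified")
```

Because the classifier sees only addresses, protocol and ports, two different applications that both send traffic to TCP port 80 receive the same class, which illustrates why classification confined to these layers cannot distinguish traffic by application or user identity.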

In addition, the known systems for managing network resources do not effectively address the problem of bandwidth management. Bandwidth is often consumed by low priority tasks at the expense of business critical applications. In systems that do provide for priority based bandwidth allocations, the bandwidth allocations are static and are not adjusted dynamically in response to changing network conditions.

SUMMARY

Accordingly, the present description provides systems and methods for managing network bandwidth consumption, which may include use of an agent module loadable on a networked computer. The agent module is configured to obtain an allocation of network bandwidth usable by the networked computer, and is further configured to sub-allocate such allocation among multiple bandwidth-consuming components associated with the networked computer. The system may further include multiple such agent modules loaded on plural networked computers, and a control module configured to interact with each of the agent modules to dynamically manage bandwidth usage by the networked computers.
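One possible way for an agent to sub-allocate its allocation is sketched below purely as an example. The sketch assumes a proportional split by per-component weight; this summary does not specify the sub-allocation rule, so the weighting scheme, the function name `sub_allocate` and the units (kbps) are all assumptions made for illustration.

```python
from typing import Dict


def sub_allocate(allocation_kbps: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Split an agent's bandwidth allocation among components by weight.

    Assumption: a proportional-share rule. Other rules (fixed caps,
    strict priority) would be equally consistent with the description.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one component must have a positive weight")
    return {name: allocation_kbps * w / total for name, w in weights.items()}
```

For instance, an allocation of 1000 kbps divided between two components weighted 3 and 1 gives the first component 750 kbps and the second 250 kbps.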

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a conceptual depiction of the OSI and TCP/IP layered protocol models.

FIG. 2 is a view of a distributed network system in which the software, systems and methods described herein may be deployed.

FIG. 3 is a schematic view of a computing device that may be deployed in the distributed network system of FIG. 2.

FIG. 4 is a block diagram view depicting exemplary agent modules and control modules that may be used to manage resources in a distributed network such as that depicted in FIG. 2.

FIG. 5 is a block diagram view depicting various exemplary components that may be employed in connection with the described software, systems and methods, including two agent modules, a control module and a configuration utility.

FIG. 6 is a block diagram depicting an exemplary deployment of an agent module in relation to a layered protocol stack of a computing device.

FIG. 7 is a block diagram depicting another exemplary deployment of an agent module in relation to a layered protocol stack of a computing device.

FIG. 8 is a block diagram depicting yet another exemplary deployment of an agent module in relation to a layered protocol stack of a computing device.

FIG. 9 is a block diagram depicting exemplary component parts of an agent module embodiment.

FIG. 10 is a block diagram depicting exemplary component parts of a control module embodiment.

FIG. 11A is a flowchart depicting a method for allocating bandwidth among a plurality of computers.

FIG. 11B is a flowchart depicting another method for allocating bandwidth among a plurality of computers.

FIG. 11C is a flowchart depicting yet another method for allocating bandwidth among a plurality of computers.

FIG. 11D is a flowchart depicting yet another method for allocating bandwidth among a plurality of computers.

FIG. 12 is a flowchart depicting a method for monitoring the status of network resources.

FIG. 13 is a view of a main configuration screen of a configuration utility that may be employed in connection with the software, systems and methods described herein.

FIG. 14 is a view of another configuration screen of the configuration utility depicted in FIG. 13.

FIG. 15 is a view of yet another configuration screen of the configuration utility depicted in FIG. 13.

FIG. 16 is a view of yet another configuration screen of the configuration utility depicted in FIG. 13.

FIG. 17 schematically depicts a multi-tier control architecture that may be employed in connection with the management software, systems and methods described herein.

FIG. 18 depicts an exemplary network configuration allowing for centralized definition of network policies, and dissemination of those policies to pertinent distributed locations.

FIG. 19 depicts an exemplary multi-tier system and method for providing granular control over bandwidth usage of individual bandwidth-consuming components.

FIG. 20 depicts an exemplary computing device loaded with an embodiment of an agent software module configured to manage bandwidth consumption of an individual application socket or transaction.

FIG. 21 depicts an exemplary computing device loaded with alternate embodiments of an agent module configured to monitor and manage network traffic to and from applications running on the computing device.

DETAILED DESCRIPTION

The present description provides a system and method for managing network resources in a distributed networking environment, such as distributed network 10 depicted in FIG. 2. The software, system and methods increase productivity and customer/user satisfaction, minimize frustration associated with using the network, and ultimately ensure that network resources are used in a way consistent with underlying business or other objectives.

The systems and methods may employ two main software components, an agent and a control module, also referred to as a control point. The agents and control points may be deployed throughout distributed network 10, and may interact with each other to manage network resources. A plurality of agents may be deployed to intelligently couple clients, servers and other computing devices to the underlying network. The deployed agents monitor, analyze and act upon network events relating to the networked devices with which they are associated. The agents typically are centrally coordinated and/or controlled by one or more control points. The agents and control points may interact to control and monitor network events, track operational and congestion status of network resources, select optimum targets for network requests, dynamically manage bandwidth usage, and share information about network conditions with customers, users and IT personnel.

As indicated, distributed network 10 may include a local network 12 and a plurality of remote networks 14 linked by a public network 16 such as the Internet. The local network and remote networks may be connected to the public network with network infrastructure devices such as routers 18.

Local network 12 typically includes servers 20 and client devices such as client computers 22 interconnected by network link 24. Additionally, local network 12 may include any number and variety of devices, including file servers, applications servers, mail servers, WWW servers, databases, client computers, remote access devices, storage devices, printers and network infrastructure devices such as routers, bridges, gateways, switches, hubs and repeaters. Remote networks 14 may similarly include any number and variety of networked devices.

Indeed, virtually any type of computing device may be connected to the networks depicted in FIG. 2, including general purpose computers, laptop computers, handheld computers, wireless computing devices, mobile telephones, pagers, pervasive computing devices and various other specialty devices. Typically, many of the connected devices are general purpose computers which have at least some of the elements shown in FIG. 3, a block diagram depiction of a computer system 40. Computer system 40 includes a processor 42 that processes digital data. The processor may be a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing a combination of instruction sets, a microcontroller, or virtually any other processor/controller device. The processor may be a single device or a plurality of devices.

Referring still to FIG. 3, it will be noted that processor 42 is coupled to a bus 44 which transmits signals between the processor and other components in the computer system. Those skilled in the art will appreciate that the bus may be a single bus or a plurality of buses. A memory 46 is coupled to bus 44 and comprises a random access memory (RAM) device 47 (referred to as main memory) that stores information or other intermediate data during execution by processor 42. Memory 46 also includes a read only memory (ROM) and/or other static storage device 48 coupled to the bus that stores information and instructions for processor 42. A basic input/output system (BIOS) 49, containing the basic routines that help to transfer information between elements of the computer system, such as during start-up, is stored in ROM 48. A data storage device 50 also is coupled to bus 44 and stores information and instructions. The data storage device may be a hard disk drive, a floppy disk drive, a CD-ROM device, a flash memory device or any other mass storage device. In the depicted computer system, a network interface 52 also is coupled to bus 44. The network interface operates to connect the computer system to a network (not shown).

Computer system 40 may also include a display device controller 54 coupled to bus 44. The display device controller allows coupling of a display device to the computer system and operates to interface the display device to the computer system. The display device controller 54 may be, for example, a monochrome display adapter (MDA) card, a color graphics adapter (CGA) card, or other display device controller. The display device (not shown) may be a television set, a computer monitor, a flat panel display or other display device. The display device receives information and data from processor 42 through display device controller 54 and displays the information and data to the user of computer system 40.

An input device 56, including alphanumeric and other keys, typically is coupled to bus 44 for communicating information and command selections to processor 42. Alternatively, input device 56 is not directly coupled to bus 44, but interfaces with the computer system via infra-red coded signals transmitted from the input device to an infra-red receiver in the computer system (not shown). The input device may also be a remote control unit having keys that select characters or command selections on the display device.

The various computing devices coupled to the networks of FIG. 2 typically communicate with each other across network links using communications software employing various communications protocols. The communications software for each networked device typically consists of a number of protocol layers, through which data is sequentially transferred as it is exchanged between devices across a network link. FIG. 1 respectively depicts the OSI layered protocol model and a layered model based on the TCP/IP suite of protocols. These two models dominate the field of network communications software. As seen in the figure, the OSI model has seven layers, including an application layer, a presentation layer, a session layer, a transport layer, a network layer, a data link layer and a physical layer. The TCP/IP-based model includes an application layer, a transport layer, a network layer, a data link layer and a physical layer.

Each layer in the models plays a different role in network communications. Conceptually, all of the protocol layers lie in a data transmission path that is “between” an application program running on the particular networked device and the network link, with the application layer being closest to the application program. When data is transferred from an application program running on one computer across the network to an application program running on another computer, the data is transferred down through the protocol layers of the first computer, across the network link, and then up through the protocol layers on the second computer.

In both of the depicted models, the application layer is responsible for interacting with an operating system of the networked device and for providing a window for application programs running on the device to access the network. The transport layer is responsible for providing reliable, end-to-end data transmission between two end points on a network, such as between a client device and a server computer, or between a web server and a DNS server. Depending on the particular transport protocol, transport functionality may be realized using either connection-oriented or connectionless data transfer. The network layer typically is not concerned with end-to-end delivery, but rather with forwarding and routing data to and from nodes between endpoints. The layers below the transport and network layers perform other functions, with the lowest levels addressing the physical and electrical issues of transmitting raw bits across a network link.

The systems and methods described herein are applicable to a wide variety of network environments employing communications protocols adhering to either of the layered models depicted in FIG. 1, or to any other layered model. Furthermore, the systems and methods are applicable to any type of network topology, and to networks using both physical and wireless connections.

The present description provides software, systems and methods for managing the resources of an enterprise network, such as that depicted in FIG. 2. This may be accomplished using two interacting software components, an agent and a control point, both of which may be adapted to run on, or be associated with, computing devices such as the computing device described with reference to FIG. 3. As seen in FIG. 4, a plurality of agents 70 and one or more control points 72 may be deployed throughout distributed network 74 by loading the agent and control point software modules on networked computing devices such as clients 22 and server 20. As will be discussed in detail, the agents and control points may be adapted and configured to enforce system policies; to monitor and analyze network events, and take appropriate action based on these events; to provide valuable information to users of the network; and ultimately to ensure that network resources are efficiently used in a manner consistent with underlying business or other goals.

The described software, systems and methods may be configured using a third software component, to be later discussed in more detail with reference to FIGS. 5 and 13-16. Typically, this configuration utility is a platform-independent application that provides a graphical user interface for centrally managing configuration information for the control points and agents. In addition, the configuration utility may be adapted to communicate and interface with other management systems, including management platforms supplied by other vendors.

As indicated in FIG. 4, each control point 72 is typically associated with multiple agents 70, and the associated agents are referred to as being within a domain 76 of the particular control point. The control points coordinate and control the activity of the distributed agents within their domains. In addition, the control points may monitor the status of network resources, and share this information with management and support systems and with the agents.

Control points 72 and agents 70 may be flexibly deployed in a variety of configurations. For example, each agent may be associated with a primary control point and one or more backup control points that will assume primary control if necessary. Such a configuration is illustrated in FIG. 4, where control points 72 within the dashed lines function as primary connections, with the control point associated with server device 20 functioning as a backup connection for all of the depicted agents. In addition, the described exemplary systems may be configured so that one control point coordinates and controls the activity of a single domain, or of multiple domains. Alternatively, one domain may be controlled and coordinated by the cooperative activity of multiple control points. In addition, agents may be configured to have embedded control point functionality, and may therefore operate without an associated control point entity.
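The primary/backup arrangement described above may be sketched, by way of example only, as a simple ordered selection. The function name `select_control_point` and the `is_reachable` callback are hypothetical; how an agent actually determines that a control point is unavailable is not specified here and is left to the caller in this sketch.

```python
from typing import Callable, Optional, Sequence


def select_control_point(
    primary: str,
    backups: Sequence[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the primary control point if reachable, else the first reachable backup."""
    for candidate in (primary, *backups):
        if is_reachable(candidate):
            return candidate
    # No control point is reachable. An agent with embedded control point
    # functionality might continue on its own in this case (assumption).
    return None
```

The selection order is the only design choice in the sketch: the primary is always preferred, and backups are tried in the order configured.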

Typically, the agents monitor network resources and the activity of the device with which they are associated, and communicate this information to the control points. In response to monitored network conditions and data reported by agents, the control points may alter the behavior of particular agents in order to provide the desired network services. The control points and agents may be loaded on a wide variety of devices, including general purpose computers, servers, routers, hubs, palm computers, pagers, cellular telephones, and virtually any other networked device having a processor and memory. Agents and control points may reside on separate devices, or simultaneously on the same device.

FIG. 5 illustrates an example of the way in which the various components of the described software, systems and methods may be physically interconnected with a network link 90. The components are all connected to network link 90 by means of layered communications protocol software 92. The components communicate with each other via the communications software and network link. As will be appreciated by those skilled in the art, network link 90 may be a physical or wireless connection, or a series of links including physical and wireless segments. More specifically, the depicted system includes an agent 70 associated with a client computing device 22, including an application program 98. Another agent is associated with server computing device 20. The agents monitor the activity of their associated computing devices and communicate with control point 72. Configuration utility 106 communicates with all of the other components, and with other management systems, to configure the operation of the various components and monitor the status of the network.

The system policies that define how network resources are to be used may be centrally defined and tailored to most efficiently achieve underlying goals. Defined policies are accessed by the control points, which in turn communicate various elements and parameters associated with the policies to the agents within their domain. At a very basic level, a policy contains rules about how network resources are to be used, with the rules containing conditions and actions to be taken when the conditions are satisfied. The agents and control points monitor the network and devices connected to the network to determine when various rules apply and whether the conditions accompanying those rules are satisfied. Once the agents and/or control points determine that action is required, they take the necessary action(s) to enforce the system policies.

For example, successful businesses often strive to provide excellent customer services. This underlying business goal can be translated into many different policies defining how network resources are to be used. One example of such a policy would be to prevent or limit access to non-business critical applications when performance of business critical applications is degraded beyond a threshold point. Another example would be to use QoS techniques to provide a guaranteed or high level of service to e-commerce applications. Yet another example would be to dynamically increase the network bandwidth allocated to a networked computer whenever it is accessed by a customer. Also, bandwidth for various applications might be restricted during times when there is heavy use of network resources by customers.

Control points 72 would access these policies and provide policy data to agents 70. Agents 70 and control points 72 would communicate with each other and monitor the network to determine how many customers were accessing the network, what computers the customer(s) were accessing, and what applications were being accessed by the customers. Once the triggering conditions were detected, the agents and control points would interact to re-allocate bandwidth, provide specified service levels, block or restrict various non-customer activities, etc.

Another example of policy-based management would be to define an optimum specification of network resources or service levels for particular types of network tasks. The particular policies would direct the management entities to determine whether the particular task was permitted, and if permitted, the management entities would interact to ensure that the desired level of resources was provided to accomplish the task. If the optimum resources were not available, the applicable policies could further specify that the requested task be blocked, and that the requesting user be provided with an informative message detailing the reason why the request was denied. Alternatively, the policies could specify that the user be provided with various options, such as proceeding with the requested task, but with sub-optimal resources, or waiting to perform the task until a later time.

For example, continuous media applications such as IP telephony have certain bandwidth requirements for optimum performance, and are particularly sensitive to network jitter and delay. Policies could be written to specify a desired level of service, including bandwidth requirements and threshold levels for jitter and delay, for client computers attempting to run IP telephony applications. The policies would further direct the agents and control modules to attempt to provide the specified level of service. Security checking could also be included to ensure that the particular user or client computer was permitted to run the application. In the event that the specified service level could not be provided, the requesting user could be provided with a message indicating that the resources for the request were not available. The user could also be offered various options, including proceeding with a sub-optimal level of service, placing a conventional telephone call, waiting to perform the task until a later time, etc.
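The decision flow of the IP telephony example may be sketched as follows, strictly as an illustration. The `ServiceLevel` type, the `decide` function, the three outcome strings and all numeric values are assumptions introduced for the sketch; the description does not prescribe particular thresholds or outcome names.

```python
from dataclasses import dataclass


@dataclass
class ServiceLevel:
    bandwidth_kbps: float  # required minimum, or currently available amount
    jitter_ms: float       # maximum tolerated, or currently observed
    delay_ms: float        # maximum tolerated, or currently observed


def decide(required: ServiceLevel, measured: ServiceLevel, user_permitted: bool) -> str:
    """Return "deny", "proceed" or "offer_options" for a requested task."""
    # Security check first: the user or client must be allowed to run the task.
    if not user_permitted:
        return "deny"
    meets_level = (
        measured.bandwidth_kbps >= required.bandwidth_kbps
        and measured.jitter_ms <= required.jitter_ms
        and measured.delay_ms <= required.delay_ms
    )
    # When the level cannot be met, the user is offered alternatives such as
    # proceeding with sub-optimal service or waiting until later.
    return "proceed" if meets_level else "offer_options"
```

In this sketch a permitted user whose measured conditions fall short of the required level is not simply refused but is routed to the options path, mirroring the alternatives described above.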

The software, system and methods of the present description may be used to implement a wide variety of system policies. The policy rules and conditions may be based on any number of parameters, including IP source address, IP destination address, source port, destination port, protocol, application identity, user identity, device identity, URL, available device bandwidth, application profile, server profile, gateway identity, router identity, time-of-day, network congestion, network load, network population, available domain bandwidth and resource status, to name but a partial list. The actions taken when the policy conditions are satisfied can include blocking network access, adjusting service levels and/or bandwidth allocations for networked devices, blocking requests to particular URLs, diverting network requests away from overloaded or underperforming resources, redirecting network requests to alternate resources and gathering network statistics.

Some of the parameters listed above may be thought of as “client parameters,” because they are normally evaluated by an agent monitoring a single networked client device. These include IP source address, IP destination address, source port, destination port, protocol, application identity, user identity, available device bandwidth and URL. Other parameters, such as application profile, server profile, gateway identity, router identity, time-of-day, network congestion, network load, network population, available domain bandwidth and resource status may be thought of as “system parameters” because they pertain to shared resources, aggregate network conditions or require evaluation of data from multiple agent modules. Despite this, there is not a precise distinction between client parameters and system parameters. Certain parameters, such as time-of-day, may be considered either a client parameter or a system parameter, or both.

Policy-based network management, QoS implementation, and the otherfunctions of the agents and control points depend on obtaining real-timeinformation about the network. As will be discussed, certain describedembodiments and implementations provide improvements over knownpolicy-based QoS management solutions because of the enhanced ability toobtain detailed information about network conditions and the activity ofnetworked devices. Many of the policy parameters and conditionsdiscussed above are accessible due to the particular way the agentmodule embodiments may be coupled to the communications software oftheir associated devices. Also, as the above examples suggest, managingbandwidth and ensuring its availability for core applications is anincreasingly important consideration in managing networks. Certainembodiments described herein provide for improved dynamic allocation ofbandwidth and control of resource consumption in response to changingnetwork conditions.

The ability of the systems described herein to flexibly deploy policy-based QoS management solutions based on detailed information about network conditions has a number of significant benefits. These benefits include reducing frustration associated with using the network, reducing help calls to IT personnel, increasing productivity, lowering business costs associated with managing and maintaining enterprise networks, and increasing customer/user loyalty and satisfaction. Ultimately, the systems and methods ensure that network resources are used in a way that is consistent with underlying goals and objectives.

Implementation of policy-based QoS between the application and transport layers has another advantage: it allows support for encryption and other security implementations carried out using Virtual Private Networking (VPN) or the IPSec protocol.

Referring now to FIGS. 6-9, illustrative embodiments of the agent module will be more particularly described. Each agent module may monitor the status and activities of its associated client, server, pervasive computing device or other computing device; communicate this information to one or more control points; enforce system policies under the direction of the control points; and provide messages to network users and administrators concerning network conditions. FIGS. 6-8 are conceptual depictions of networked computing devices, and show how the agent software may be associated with the networked devices relative to layered protocol software used by the devices for network communication.

As seen in FIG. 6, agent 70 is interposed between application program122 and a communications protocol layer for providing end-to-end datatransmission, such as transport layer 124 of communications protocolstack 92. Typically, the agent modules described herein may be used withnetwork devices that employ layered communications software adhering toeither the OSI or TCP/IP-based protocol models. Thus, agent 70 isdepicted as “interposed,” i.e. in a data path, between an applicationprogram and a transport protocol layer. However, it will be appreciatedby those skilled in the art that the various agent module embodimentsmay be used with protocol software not adhering to either the OSI orTCP/IP models, but that nonetheless includes a protocol layer providingtransport functionality, i.e. providing for end-to-end datatransmission.

Because of the depicted position within the data path, agent 70 is ableto monitor network traffic and obtain information that is not availableby hooking into transport layer 124 or the layers below the transportlayer. At the higher layers, the available data is richer and moredetailed. Hooking into the stack at higher layers allows the network tobecome more “application-aware” than is possible when monitoring occursat the transport and lower layers.

The agent modules may be interposed at a variety of points betweenapplication program 122 and transport layer 124. Specifically, as shownin FIGS. 7 and 8, agent 70 may be associated with a client computer sothat it is adjacent an application programming interface (API) adaptedto provide a standardized interface for application program 122 toaccess a local operating system (not shown) and communications stack 92.In FIG. 7, agent 70 is adjacent a winsock API 128 and interposed betweenapplication program 122 and the winsock interface. FIG. 8 shows analternate configuration, in which agent 70 again hooks into a socketobject, such as API 128, but downstream of the socket interface. Witheither configuration, the agent is interposed between the applicationand transport layer 124 of communications stack 92, and is adapted todirectly monitor data received by or sent from the winsock interface.

As shown in FIG. 8, agent 70 may be configured to hook into lower layersof communications stack 92. This allows the agent to accurately monitornetwork traffic volumes by providing a correction mechanism to accountfor data compression or encryption occurring at protocol layers belowtransport layer 124. For example, if compression or encryption occurswithin transport layer 124, monitoring at a point above the transportlayer would yield an inaccurate measure of the network trafficassociated with the computing device. Hooking into lower layers withagent 70 allows network traffic to be accurately measured in the eventthat compression, encryption or other data processing that qualitativelyor quantitatively affects network traffic occurs at lower protocollayers.

An embodiment of the agent module is depicted in FIG. 9. As shown, agent 70 may include a redirector module 130, a traffic control module 132, an administrator module 134, a DNS module 136, a popapp module 138, a message broker module 140, a system services module 142, and a popapp 144. Redirector module 130 intercepts winsock API calls made by applications running on networked devices such as the client computers depicted in FIGS. 2 and 3. Redirector module 130 then hands these calls to one or more of the other agent components for processing. As discussed with reference to FIGS. 6-8, redirector module 130 is positioned to allow the agent to monitor data at a data transmission point between an application program running on the device and the transport layer of the communications stack. Depending on the configuration of the agent and control point, the intercepted winsock calls may be rejected, changed, or passed on by agent 70.

Traffic control module 132 implements QoS and system policies andassists in monitoring network conditions. Traffic control module 132implements QoS methods by controlling the network traffic flow betweenapplications running on the agent device and the network link. Thetraffic flow is controlled to deliver a specified network service level,which may include specifications of bandwidth, data throughput, jitter,delay and data loss.

To provide the specified network service level, traffic control module132 may maintain a queue or plurality of queues. When data is sent fromthe client to the network, or from the network to the client, redirectormodule 130 intercepts the data, and traffic module 132 places theindividual units of data in the appropriate queue. The control pointsmay be configured to periodically provide traffic control commands,which may include the QoS parameters and service specificationsdiscussed above. In response, traffic control module 132 controls thepassing of data into, through or out of the queues in order to providethe specified service level.

More specifically, the outgoing traffic rate may be controlled using aplurality of priority-based transmission queues, such as transmissionqueues 132 a. When an application or process is invoked by a computingdevice with which agent 70 is associated, a priority level is assignedto the application, based on centrally defined policies and prioritydata supplied by the control point. Specifically, as will be discussed,the control points maintain user profiles, applications profiles andnetwork resource profiles. These profiles include priority data which isprovided to the agents.

Transmission queues 132 a may be configured to release data for transmission to the network at regular intervals. Using the parameters specified in traffic control commands issued by a control point, traffic control module 132 calculates how much data can be released from the transmission queues in a particular interval. For example, if the specified average traffic rate is 100 KBps and the queue release interval is 1 ms, then the total amount of data that the queues can release in a given interval is 100 bytes. The relative priorities of the queues containing data to be transmitted determine how much of the allotment may be released by each individual queue. For example, assuming there are only two queues, Q1 and Q2, that have data queued for transmission, Q1 will be permitted to transmit about 66.67% of the overall allotted interval release if its priority is twice that of Q2. Q2 would only be permitted to release about 33.33% of the allotment. If their priorities were equal, each queue would be permitted to release 50% of the interval allotment for forwarding to the network link.
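The per-interval arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the described implementation: the function names `interval_allotment` and `split_by_priority` are hypothetical, the rate is expressed in generic data units per second, and "K" is assumed to mean 1,000.

```python
def interval_allotment(rate_per_s: float, interval_s: float) -> float:
    """Total data units the queues may release during one release interval."""
    return rate_per_s * interval_s


def split_by_priority(allotment: float, priorities: dict) -> dict:
    """Divide the interval allotment among queues in proportion to priority.

    Assumption: only queues that currently hold data are passed in.
    """
    total = sum(priorities.values())
    if total == 0:
        return {name: 0.0 for name in priorities}
    return {name: allotment * p / total for name, p in priorities.items()}


# 100,000 units/s with a 1 ms release interval; Q1 has twice the priority of Q2.
allotment = interval_allotment(100_000, 0.001)
shares = split_by_priority(allotment, {"Q1": 2, "Q2": 1})
```

With these inputs the allotment is 100 units per interval, of which Q1 receives two thirds and Q2 one third; equal priorities would give each queue half.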

If waiting data is packaged into units that are larger than the amount a given queue is permitted to release, the queue accumulates “credits” for intervals in which it does not release any waiting data. When enough credits are accumulated, the waiting message is released for forwarding to the network.
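One possible reading of the credit mechanism is sketched below in Python. The class name `CreditQueue`, the `tick` method and the rule that an empty queue does not bank credit are all assumptions made for illustration; the text does not specify these details.

```python
from collections import deque


class CreditQueue:
    """Queue that banks unused per-interval allowance as credit (illustrative)."""

    def __init__(self):
        self.waiting = deque()  # sizes of data units waiting for release
        self.credit = 0.0       # allowance accumulated over past intervals

    def enqueue(self, size):
        self.waiting.append(size)

    def tick(self, share):
        """Run one release interval with this queue's share of the allotment.

        Returns the sizes of the units released during this interval.
        """
        released = []
        if not self.waiting:
            self.credit = 0.0  # assumption: an idle queue does not bank credit
            return released
        self.credit += share
        # Release units from the head of the queue while credit covers them.
        while self.waiting and self.waiting[0] <= self.credit:
            size = self.waiting.popleft()
            self.credit -= size
            released.append(size)
        return released
```

For example, a 150-unit message in a queue that is granted 50 units per interval would be held for two intervals and released on the third.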

Similarly, to control the rate at which network traffic is received,traffic control module 132 may be configured to maintain a plurality ofseparate receive queues, such as receive queues 132 b. In addition tothe methods discussed above, various other methods may be employed tocontrol the rate at which network traffic is sent and received by thequeues. Also, the behavior of the transmit and receive queues may becontrolled through various methods to control jitter, delay, loss andresponse time for network connections.

The transmit and receive queues may also be configured to detect networkconditions such as congestion and slow responding applications orservers. For example, for each application, transmitted packets or otherdata units may be timestamped when passed out of a transmit queue. Whencorresponding packets are received for a particular application, thereceive and send times may be compared to detect network congestionand/or slow response times for various target resources. Thisinformation may be reported to the control points and shared with otheragents within the domain. The response time and other performanceinformation obtained by comparing transmit and receive times may also beused to compile and maintain statistics regarding various networkresources.
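The timestamp comparison described above might be sketched as follows. This is an illustrative Python outline only: `ResponseTimer`, its method names, the use of a unit identifier to pair sent and received data, and the fixed averaging threshold are assumptions, not details from the description.

```python
class ResponseTimer:
    """Pairs send and receive times per application to estimate response time."""

    def __init__(self, threshold_s):
        self.threshold_s = threshold_s  # hypothetical "slow" threshold, seconds
        self.sent = {}      # (app, unit_id) -> time the unit left a transmit queue
        self.samples = {}   # app -> list of observed response times

    def on_transmit(self, app, unit_id, now):
        self.sent[(app, unit_id)] = now

    def on_receive(self, app, unit_id, now):
        """Record the response time for a matching unit; None if unmatched."""
        start = self.sent.pop((app, unit_id), None)
        if start is None:
            return None
        elapsed = now - start
        self.samples.setdefault(app, []).append(elapsed)
        return elapsed

    def is_slow(self, app):
        """True when the mean observed response time exceeds the threshold."""
        times = self.samples.get(app)
        return bool(times) and sum(times) / len(times) > self.threshold_s
```

The per-application samples kept here are also the kind of data an agent could forward to a control point for statistics on network resources.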

Using this detection and reporting mechanism, a control point may be configured to reduce network loads by instructing traffic control module 132 to close low priority sessions and block additional sessions whenever heavy network congestion is reported by one of the agents. In conjunction, as will be explained, popapp module 138 may provide a message to the user explaining why sessions are being closed. In addition to closing the existing sessions, the control point may be configured to instruct the agents to block any further sessions. This action may also be accompanied by a user message in response to attempts to launch a new application or network process. When the network load is reduced, the control point will send a message to the agents allowing sessions to be established again.

In addition to identifying congestion and slow response times, traffic control module 132 may be more generally configured to aid in identifying downed or under-performing network resources. When a connection to a target resource fails, traffic control module 132 notifies popapp module 138, which in turn launches an executable to perform a root-cause analysis of the problem. Agent 70 then provides the control point with a message identifying the resource and its status, if possible.

In addition, when a connection fails, popapp module 138 may beconfigured to provide a message to the user, including an option toinitiate an autoconnect routine targeting the unavailable resource.Enabling autoconnect causes the agent to periodically retry theunavailable resource. This feature may be disabled, if desired, to allowthe control point to assume responsibility for determining when theresource becomes available again. As will be later discussed, thedescribed system may be configured so that the control modules assumeresponsibility for monitoring unavailable resources in order to minimizeunnecessary network traffic.

As discussed below, various agent components also monitor networkconditions and resource usage for the purpose of compiling statistics.An additional function of traffic control module 132 is to aid inperforming these functions by providing information to other agentcomponents regarding accessed resources, including resource performanceand frequency of access.

As suggested in the above discussion of traffic control module 132,popapp module 138 stores and is responsible for launching a variety ofsmall application modules such as application 144, known as popapps, toperform various operations and enhance the functioning of the describedsystem. Popapps detect and diagnose network conditions such as downedresources, provide specific messages to users and IT personnel regardingerrors and network conditions, and interface with other informationmanagement, reporting or operational support systems, such as policymanagers, service level managers, and network and system managementplatforms. Popapps may be customized to add features to existingproducts, to tailor products for specific customer needs, and tointegrate the software, systems and methods with technology supplied byother vendors.

Administrator module 134 interacts with various other agent modules,maintains and provides network statistics, and provides an interface forcentrally configuring agents and other components of the system. Withregard to agent configuration, administrator module 134 interfaces withconfiguration utility 106 (shown in FIGS. 5 and 13-16), in order toconfigure various agent parameters. Administrator module 134 also servesas a repository for local reporting and statistics information to becommunicated to the control points. Based on information obtained byother agent modules, administrator module 134 maintains localinformation regarding accessed servers, DNS servers, gateways, routers,switches, applications and other resources. This information iscommunicated on request to the control point, and may be used fornetwork planning or to dynamically alter the behavior of agents. Inaddition, administrator module 134 stores system policies and/orcomponents of policies, and provides policy data to various agentcomponents as needed to implement and enforce the policies.Administrator module 134 also includes support for interfacing thedescribed software and systems with standardized network managementprotocols and platforms.

DNS module 136 provides the agent with configurable address resolvingservices. DNS module 136 may include a local cache of DNS information,and may be configured to first resolve address requests using this localcache. If the request cannot be resolved locally, the request issubmitted to a control point, which resolves the address with its owncache, provided the address is in the control point cache and the userhas permission to access the address. If the request cannot be resolvedwith the control point cache, the connected control point submits therequest to a DNS server for resolution. If the address is still notresolved at this point, the control point sends a message to the agent,and the agent then submits the request directly to its own DNS serverfor resolution.
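The resolution order described above can be summarized in a short Python sketch. All names here (`resolve`, `agent_cache`, `cp_cache`, `cp_dns`, `agent_dns`, `permitted`) are hypothetical, the caches are modeled as plain dictionaries, and the lookups as callables returning an address or `None`. The sketch follows the stated sequence literally and does not model what happens when a permission check fails.

```python
def resolve(name, agent_cache, cp_cache, cp_dns, agent_dns,
            permitted=lambda name: True):
    """Return (address, source) following the cascade described in the text."""
    # 1. The agent's local cache is consulted first.
    if name in agent_cache:
        return agent_cache[name], "agent cache"
    # 2. The control point cache, if the address is present and the user
    #    is permitted to access it.
    if name in cp_cache and permitted(name):
        return cp_cache[name], "control point cache"
    # 3. The control point submits the request to a DNS server.
    address = cp_dns(name)
    if address is not None:
        return address, "control point DNS server"
    # 4. The agent falls back to its own DNS server.
    address = agent_dns(name)
    if address is not None:
        return address, "agent DNS server"
    return None, "unresolved"
```

Passing the lookups in as callables keeps the sketch independent of any particular resolver library.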

DNS module 136 also monitors address requests and shares the content ofthe requests with administrator module 134. The requests are locallycompiled and ultimately provided to the control points, which maintaindynamically updated lists of the most popular DNS servers. In addition,DNS module 136 is adapted to interact with control point 72 in order toredirect address resolving requests and other network requests toalternate targets, if necessary.

Message broker module 140 creates and maintains connections to the oneor more control points with which the agent interacts. The various agentcomponents use the message broker module to communicate with each otherand with a connected control point. Message broker module 140 includesmessage creator and message dispatcher processes for creating andsending messages to the control points. The message creator processincludes member functions, which create control point messages byreceiving message contents as parameters and encoding the contents in astandard network format. The message creator process also includes amember function to decode the messages received from the control pointand return the contents in a format usable by the various components ofthe agent.

After encoding by the creator process, control point messages are addedto a transmission queue and extracted by the message dispatcher functionfor transmission to the control point. Messages extracted from the queueare sent to the agent's active control point. In addition, thedispatcher may be configured to ensure delivery of the message using asequence numbering scheme or other error detection and recovery methods.

Messages and communications from an agent to a control point are made using a unicast addressing mechanism. Communications from the control point to an agent or agents may be made using either a unicast or a multicast addressing scheme. When configured for multicast operation, the control point and agents may be set to revert to unicast to allow for communication with devices that do not support IP multicast.

Once a connection with a control point is established, message broker module 140 monitors the status of the connection and switches over to a backup control point upon detecting a connection failure. If neither the active nor the backup connection is available, network traffic is passed on transparently.

System services module 142 provides various support functions to theother agent components. First, system services module maintains dynamiclists of user profiles, server profiles, DNS server profiles, controlpoint connections and other data. The system services module alsoprovides a tracing capability for debugging, and timer services for useby other agent components. System services module may also be configuredwith a library of APIs to interface the agent with the operating systemsand other components of the device that the agent is associated with.

Referring now to FIGS. 10-12, control point 72 and its functions will be more particularly described. As seen in FIG. 10, control point 72 may include a traffic module 160, a server profile module 162, a DNS server profile module 164, a gateway profile module 166, an administrator module 168, a message broker module 170, a popapp interface 172 and a popapp 174.

Control point traffic module 160 implements policy-based, QoS techniquesby coordinating the service-level enforcement activities of the agents.As part of this function, traffic module 160 dynamically allocatesbandwidth among the agents in its domain by regularly obtainingallocation data from the agents, calculating bandwidth allocations foreach agent based on this data, and communicating the calculatedallocations to the agents for enforcement. For example, control point 72can be configured to recalculate bandwidth allocations every fiveseconds. During each cycle, between re-allocation, the agents restrictbandwidth usage by their associated devices to the allocated amount andmonitor the amount of bandwidth actually used. At the end of the cycle,each agent reports the bandwidth usage and other allocation data to thecontrol point to be used in re-allocating bandwidth.

During re-allocation, traffic module 160 divides the total bandwidthavailable for the upcoming cycle among the agents within the domainaccording to the priority data reported by the agents. The result is aconfigured bandwidth CB particular to each individual agent,corresponding to that agent's fair share of the available bandwidth. Thepriorities and configured bandwidths are a function of system policies,and may be based on a wide variety of parameters, including applicationidentity, user identity, device identity, source address, destinationaddress, source port, destination port, protocol, URL, time of day,network load, network population, and virtually any other parameterconcerning network resources that can be communicated to, or obtained bythe control point. The detail and specificity of client-side parametersthat may be supplied to the control point is greatly enhanced by theposition of agent redirector module 130 relative to the layeredcommunications protocol stack. The high position within the stack allowsbandwidth allocation and, more generally, policy implementation, to beperformed based on very specific triggering criteria. This may greatlyenhance the flexibility and power of the described software, systems andmethods.

The priority data reported by the agents may include priority dataassociated with multiple application programs running on a singlenetworked device. In such a situation, the associated agent may beconfigured to report an “effective application priority,” which is afunction of the individual application priorities. For example, ifdevice A were running two application programs and device B were runninga single application program, device A's effective application prioritywould be twice that of device B, assuming that the individual prioritiesof all three applications were the same. The reported priority data fora device running multiple application programs may be further refined byweighting the reported priority based on the relative degree of activityfor each application program. Thus, in the previous example, if one ofthe applications running on device A was dormant or idle, thecontribution of that application to the effective priority of device Awould be discounted such that, in the end, device A and device B wouldhave nearly the same effective priority. To determine effectiveapplication priority using this weighted method, the relative degree ofactivity for an application may be measured in terms of bandwidth usage,transmitted packets, or any other activity-indicating criteria.
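The weighting idea can be illustrated with a small Python sketch. It assumes, purely for illustration, that effective priority is the sum of each application's priority multiplied by an activity weight between 0 and 1; the function name `effective_priority` and the particular weights are hypothetical, and the text leaves the exact function open.

```python
def effective_priority(apps):
    """Sum of priority * activity weight over a device's applications.

    apps: iterable of (priority, activity_weight) pairs, with the weight in
    [0, 1] derived from bandwidth usage, packets sent, or a similar measure.
    """
    return sum(priority * weight for priority, weight in apps)


# Unweighted case: device A runs two equal-priority applications, device B one.
device_a_unweighted = effective_priority([(1, 1.0), (1, 1.0)])
device_b = effective_priority([(1, 1.0)])

# Weighted case: one of device A's applications is nearly idle.
device_a_weighted = effective_priority([(1, 1.0), (1, 0.05)])
```

In the unweighted case device A's effective priority is twice that of device B; once the idle application is discounted, the two devices end up nearly equal.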

In addition to priority data, each agent may be configured to report the amount of bandwidth UB used by its associated device during the prior period, as discussed above. Data is also available for each device's allocated bandwidth AB for the previous cycle. Traffic module 160 may compare configured bandwidth CB, allocated bandwidth AB or utilized bandwidth UB for each device, or any combination of those three parameters, to determine the allocations for the upcoming cycle. To summarize the three parameters, UB is the amount the networked device used in the prior cycle, AB is the maximum amount it was allowed to use, and CB specifies the device's “fair share” of available bandwidth for the upcoming cycle.

Both utilized bandwidth UB and allocated bandwidth AB may be greater than, equal to, or less than configured bandwidth CB. AB or UB may exceed CB, for example, when a number of networked devices are using less than their configured share CB. To efficiently utilize the available bandwidth, these unused amounts are allocated to devices requesting additional bandwidth, with the result that some devices are allocated an amount AB that exceeds their configured fair share CB. Though AB and UB may exceed CB, utilized bandwidth UB cannot normally exceed allocated bandwidth AB, because the agent traffic control module enforces the allocation.

Any number of processing algorithms may be used to compare CB, AB and UB for each agent in order to calculate a new allocation; however, there are some general principles which are often employed. For example, when bandwidth is taken away from devices, it is often desirable to first reduce allocations for devices that will be least affected by the downward adjustment. Thus, traffic module 160 may be configured to first reduce allocations of clients or other devices where the associated agent reports bandwidth usage UB below the allocated amount AB. Presumably, these devices will not be affected if their allocation is reduced. Generally, traffic module 160 should not reduce any other allocation until all the unused allocations, or portions of allocations, have been reduced. The traffic module may be configured to then reduce allocations that are particularly high, or make adjustments according to some other criteria.

Traffic module 160 may also be configured so that when bandwidth becomes available, the newly-available bandwidth is provisioned according to generalized preferences. For example, the traffic module can be configured to provide surplus bandwidth first to agents that have low allocations and that are requesting additional bandwidth. After these requests are satisfied, surplus bandwidth may be apportioned according to priorities or other criteria.

FIGS. 11A, 11B, 11C and 11D depict examples of various methods that maybe implemented by traffic module 160 to dynamically allocate bandwidth.FIG. 11A depicts a process by which traffic module 160 determineswhether any adjustments to bandwidth allocations AB are necessary.Allocated bandwidths AB for certain agents are adjusted in at least thefollowing circumstances. First, as seen in steps S4 and S10, certainallocated bandwidths AB are modified if the sum of all the allocatedbandwidths ABtotal exceeds the sum of the configured bandwidths CBtotal.This situation may occur where, for some reason, a certain portion ofthe total bandwidth available to the agents in a previous cycle becomesunavailable, perhaps because it has been reserved for another purpose.In such a circumstance, it is important to reduce certain allocations ABto prevent the total allocations from exceeding the total bandwidthavailable during the upcoming cycle.

Second, if there are any agents for which AB<CB and UB≅AB, the allocation for those agents is modified, as seen in steps S6 and S10. The allocations for any such agent are typically increased. In this situation, an agent has an allocation AB that is less than its configured bandwidth CB, i.e. its existing allocation is less than its fair share of the bandwidth that will be available in the upcoming cycle. Also, the reported usage UB for the prior cycle is at or near the enforced allocation AB, and it can thus be assumed that more bandwidth would be consumed by the associated device if its allocation AB were increased.

Third, if there are any agents reporting bandwidth usage UB that is lessthan their allocation AB, as determined at step S8, then the allocationAB for such an agent is reduced for the upcoming period to free up theunused bandwidth. Steps S4, S6 and S8 may be performed in any suitableorder. Collectively, these three steps ensure that certain bandwidthallocations are modified, i.e. increased or reduced, if one or more ofthe following three conditions are true: (1) ABtotal>CBtotal, (2) AB<CBand UB≅AB for any agent, or (3) UB<AB for any agent. If none of theseare true, the allocations AB from the prior period are not adjusted.Traffic module 160 modifies allocations AB as necessary at step S10.After all necessary modifications are made, the control pointcommunicates the new allocations to the agents for enforcement duringthe upcoming cycle.
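The three checks of FIG. 11A can be expressed compactly. The Python sketch below is illustrative only: the `AgentReport` type, the function names and the 5% tolerance used to interpret “UB≅AB” are assumptions. To keep conditions (2) and (3) from overlapping, usage within the tolerance band is treated as “at or near” the allocation, and only usage below the band counts as unused.

```python
from dataclasses import dataclass


@dataclass
class AgentReport:
    cb: float  # configured ("fair share") bandwidth for the upcoming cycle
    ab: float  # bandwidth allocated in the prior cycle
    ub: float  # bandwidth actually used in the prior cycle


def near(ub, ab, tol=0.05):
    """Interpret UB ≅ AB as usage within `tol` of the allocation (assumption)."""
    return ub >= ab * (1 - tol)


def adjustment_conditions(reports, tol=0.05):
    """Evaluate the conditions checked at steps S4, S6 and S8."""
    return {
        # S4: total allocations exceed total configured bandwidth.
        "over_allocated": sum(r.ab for r in reports) > sum(r.cb for r in reports),
        # S6: some agent is below its fair share and using all it was given.
        "under_served": any(r.ab < r.cb and near(r.ub, r.ab, tol) for r in reports),
        # S8: some agent is leaving part of its allocation unused.
        "unused": any(not near(r.ub, r.ab, tol) for r in reports),
    }


def needs_adjustment(reports, tol=0.05):
    """True if any condition holds, i.e. step S10 should modify allocations."""
    return any(adjustment_conditions(reports, tol).values())
```

If every flag is false, the allocations from the prior period carry over unchanged.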

FIG. 11B depicts re-allocation of bandwidth to ensure that total allocations AB do not exceed the total bandwidth available for the upcoming cycle. At step S18, traffic module 160 has determined that the sum of allocations AB from the prior period exceeds the available bandwidth for the upcoming period, i.e. ABtotal>CBtotal. In this situation, certain allocations AB must be reduced. As seen in steps S20 and S22, traffic module 160 may be configured to first reduce allocations of agents that report bandwidth usage levels below their allocated amounts, i.e. UB<AB for a particular agent. These agents are not using a portion of their allocations, and thus are unaffected or only minimally affected when the unused portion of the allocation is removed. At step S20, the traffic module first determines whether there are any such agents. At step S22, the allocations AB for some or all of these agents are reduced. These reductions may be gradual, or the entire unused portion of the allocation may be removed at once.

After any and all unused allocation portions have been removed, it ispossible that further reductions may be required to appropriately reducethe overall allocations ABtotal. As seen in step S24, further reductionsare taken from agents with existing allocations AB that are greater thanconfigured bandwidth CB, i.e. AB>CB. In contrast to step S22, whereallocations were reduced due to unused bandwidth, bandwidth is removedat step S24 from devices with existing allocations that exceed thecalculated “fair share” for the upcoming cycle. As seen at step S26, thereductions taken at steps S22 and S24 may be performed until the totalallocations ABtotal are less than or equal to the total availablebandwidth CBtotal for the upcoming cycle.
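A minimal Python sketch of the FIG. 11B reduction order follows. It assumes the entire unused portion is reclaimed at once (the text also allows gradual reductions), takes the available bandwidth as an explicit argument, and uses a hypothetical function name and dictionary layout.

```python
def reduce_to_fit(agents, available):
    """Return new allocations whose total does not exceed `available`.

    agents: dict mapping agent name -> {"cb": ..., "ab": ..., "ub": ...}.
    """
    ab = {name: a["ab"] for name, a in agents.items()}
    excess = sum(ab.values()) - available

    # Steps S20/S22: first reclaim allocation that was not used (UB < AB).
    for name, a in agents.items():
        if excess <= 0:
            break
        cut = min(max(a["ab"] - a["ub"], 0), excess)
        ab[name] -= cut
        excess -= cut

    # Step S24: then trim allocations that still exceed the fair share (AB > CB).
    for name, a in agents.items():
        if excess <= 0:
            break
        cut = min(max(ab[name] - a["cb"], 0), excess)
        ab[name] -= cut
        excess -= cut

    return ab


agents = {
    "A": {"cb": 40, "ab": 60, "ub": 60},  # fully using an above-share allocation
    "B": {"cb": 40, "ab": 50, "ub": 30},  # leaving 20 units unused
}
new_ab = reduce_to_fit(agents, available=80)
```

Here the 30-unit excess is removed by first taking B's 20 unused units and then 10 units from A, whose allocation exceeds its configured share.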

FIG. 11C depicts a method for increasing the allocation of certainagents. As discussed with reference to FIG. 11A, where AB<CB and UB≅ABfor any agent, the allocation AB for such an agent should be increased.The existence of this circumstance has been determined at step S40. Toprovide these agents with additional bandwidth, the allocations forcertain other agents typically need to be reduced. Similar to steps S20and S22 of FIG. 11B, unutilized bandwidth is first identified andremoved (steps S42 and S44). Again, the control point may be configuredto vary the rate at which unused allocation portions are removed. Ifreported data does not reflect unutilized bandwidth, traffic module 160may be configured to then reduce allocations for agents having anallocation AB higher than their respective configured share CB, as seenin step S46. The bandwidth recovered in steps S44 and S46 is thenprovided to the agents requesting additional bandwidth. Any number ofmethods may be used to provision the recovered bandwidth. For example,preference may be given to agents reporting the largest discrepancybetween their allocation AB and their configured share CB.Alternatively, preferences may be based on application identity, useridentity, priority data, other client or system parameters, or any othersuitable criteria.
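The FIG. 11C flow can be sketched in the same style. The Python below is a hedged illustration: the function name, the 5% tolerance for “UB≅AB”, the choice to raise under-served agents only as far as CB, the decision to draw on above-share allocations whenever unused bandwidth alone is insufficient, and the largest-gap-first preference are all assumptions chosen from the options the text mentions.

```python
def boost_under_served(agents, tol=0.05):
    """Move bandwidth toward agents with AB < CB that are using all of AB.

    agents: dict mapping agent name -> {"cb": ..., "ab": ..., "ub": ...}.
    """
    ab = {name: a["ab"] for name, a in agents.items()}
    starved = [name for name, a in agents.items()
               if a["ab"] < a["cb"] and a["ub"] >= a["ab"] * (1 - tol)]
    needed = sum(agents[name]["cb"] - ab[name] for name in starved)
    pool = 0

    # Steps S42/S44: recover unutilized bandwidth from the other agents.
    for name, a in agents.items():
        if name in starved:
            continue
        cut = min(max(a["ab"] - a["ub"], 0), needed - pool)
        ab[name] -= cut
        pool += cut

    # Step S46: if still short, trim allocations above the configured share.
    if pool < needed:
        for name, a in agents.items():
            if name in starved:
                continue
            cut = min(max(ab[name] - a["cb"], 0), needed - pool)
            ab[name] -= cut
            pool += cut

    # Provision recovered bandwidth, largest CB - AB gap first.
    for name in sorted(starved, key=lambda n: agents[n]["cb"] - ab[n], reverse=True):
        grant = min(agents[name]["cb"] - ab[name], pool)
        ab[name] += grant
        pool -= grant

    return ab
```

Because the recovered pool never exceeds the total shortfall, everything that is taken is handed out again and the sum of allocations is unchanged.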

FIG. 11D depicts a general method for reallocating unused bandwidth. Atstep S60, it has been determined that certain allocations AB are notbeing fully used by the respective agents, i.e. UB<AB for at least oneagent. At step S62, the allocations AB for these agents are reduced. Aswith the reductions and modifications described with reference to FIGS.11A, 11B and 11C, the rate of the adjustment may be varied throughconfiguration changes to the control point. For example, it may bedesired that only a fraction of unused bandwidth be removed during asingle reallocation cycle. Alternatively, the entire unused portion maybe removed and reallocated during the reallocation cycle.

In step S64 of FIG. 11D, the recovered amounts are provisioned asnecessary. The recovered bandwidth may be used to eliminate adiscrepancy between the total allocations ABtotal and the availablebandwidth, as in FIG. 11B, or to increase allocations of agents who arerequesting additional bandwidth and have relatively low allocations, asin FIG. 11C. In addition, if there is enough bandwidth recovered,allocations may be increased for agents requesting additional bandwidth,i.e. UB≅AB, even where the current allocation AB for such an agent isfairly high, e.g. AB>CB. As with the methods depicted in FIGS. 11B and11C, the recovered bandwidth may be reallocated using a variety ofmethods and according to any suitable criteria.

As indicated, traffic module 160 can be configured to vary the rate at which the above allocation adjustments are made. For example, assume that a particular device is allocated 64 KBps (AB) and reports usage during the prior cycle of 62 KBps (UB). Traffic module 160 cannot determine how much additional bandwidth the device would use. Thus, if the allocation were dramatically increased, say doubled, it is possible that a significant portion of the increase would go unused. However, because the device is using an amount roughly equal to the enforced allocation AB, it can be assumed that the device would use more if the allocation were increased. Thus, it is often preferable to provide small, incremental increases. The amount of these incremental adjustments and the rate at which they are made may be configured with the configuration utility, as will be discussed with reference to FIG. 16. If the device consumes the additional amounts, successive increases can be provided if additional bandwidth is available.
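The incremental approach may be illustrated with the following sketch, which applies it to the 64 KBps example. The 10% step, the 95% threshold used to decide that UB≅AB, and the function name are assumptions chosen only for illustration; in practice such values would be configured values rather than fixed constants.

```python
def next_allocation(ab, ub, step=0.10, near=0.95, available=float("inf")):
    """Return the allocation for the next cycle.

    ab:        current allocation AB
    ub:        bandwidth used during the prior cycle UB
    step:      fractional increase granted per cycle (assumed value)
    near:      UB/AB ratio at or above which UB is treated as roughly equal to AB
               (assumed value)
    available: spare bandwidth the control point can still hand out
    """
    if ub < near * ab:
        return ab                        # device is not pressing against its allocation
    increase = min(ab * step, available)  # small increment, capped by what is available
    return ab + increase


# Device allocated 64 KBps and reporting 62 KBps of usage:
# the allocation grows by one small step rather than being doubled.
new_ab = next_allocation(64, 62)
```

Calling the function again in later cycles, with the newly reported usage, yields the successive increases described above for as long as the device keeps consuming the additional amounts.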

In addition, the bandwidth allocations and calculations may be performedseparately for the transmit and receive rates for the networked devices.In other words, the methods described with reference to FIGS. 11A-11Dmay be used to calculate a transmit allocation for a particular device,as well as a separate receive allocation. Alternatively, thecalculations may be combined to yield an overall bandwidth allocation.

Server profile module 162, DNS server profile module 164, gatewayprofile module 166 and administrator module 168 all interact with theagents to monitor the status of network resources. More specifically,FIG. 12 provides an illustrative example of how the control points andagents may be configured to monitor the status of resources on thenetwork. The monitored resource(s) may be a server, a DNS server, arouter, gateway, switch, application, etc. At step S100, a resourcestatus change has occurred. For example, a server has gone down, trafficthrough a router has dramatically increased, a particular application isunavailable, or the performance of a particular gateway has degradedbeyond a predetermined threshold specified in a system policy.

At step S102, a networked device attempts to access the resource orotherwise engages in activity on the network involving the particularresource. If the accessing or requesting device is an agent, anexecutable spawned by popapp module 138 (FIG. 9) analyzes the resource,and reports the identity and status of the resource to the control pointconnected to the agent, as indicated at step S104. Launch of the popappmay be triggered by connection errors, or by triggering criteriaspecified in system policies. For example, system policies may includeperformance benchmarks for various network resources, and may furtherspecify that popapp analysis is to be performed when resourceperformance deviates by a certain amount from the established benchmark.In addition, the control points may similarly be configured to launchpopapps to analyze network resources.

Once the control point obtains the status information, the control point reports the information to all of the agents in its domain, and instructs the agents how to handle further client requests involving the resource, as indicated at steps S108 and S110. In the event that the target resource is down, underperforming or otherwise unavailable, the instructions given to the agents will depend on whether an alternate resource is available. The control point stores dynamically updated lists of alternate available resources. If an alternate resource is available, the instructions provided to the agent may include an instruction to transparently redirect the request to an alternate resource, as shown in step S108. For example, if the control point knows of a server that mirrors the data of another server that has gone down, client requests to the down server can simply be redirected to the mirror server. Alternatively, if no alternate resource is available, the agent can be instructed to provide a user message in the event of an access attempt, as seen in step S110. The messaging function is handled by agent popapp module 138. In addition, popapp functionality may be employed by the control point to report status information to other control points and management platforms supplied by other vendors. Messages concerning resource status or network conditions may also be provided via email or paging to IT personnel.
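The choice between redirection (step S108) and a user message (step S110) may be sketched as follows. The function name, the tuple-based return value and the dictionary layouts are hypothetical; they merely illustrate the decision described above and do not represent an actual agent or control point interface.

```python
def instruction_for(resource, available, alternates):
    """Decide how an agent should handle a request for `resource`.

    available:  dict mapping resource name -> True/False (status known to the
                control point); unknown resources are assumed available
    alternates: dict mapping resource name -> list of alternate resources
                (e.g. mirror servers), as maintained by the control point
    Returns an (action, detail) tuple.
    """
    if available.get(resource, True):
        return ("pass", resource)                 # resource is up: no special handling
    for alt in alternates.get(resource, []):
        if available.get(alt, True):
            return ("redirect", alt)              # step S108: transparent redirect
    # step S110: no alternate resource, so the user is told why the attempt failed
    return ("message", f"{resource} is currently unavailable")


status = {"server-a": False, "mirror-a": True, "server-b": False}
mirrors = {"server-a": ["mirror-a"]}
decision_a = instruction_for("server-a", status, mirrors)   # redirected to the mirror
decision_b = instruction_for("server-b", status, mirrors)   # user message
```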

Still referring to FIG. 12, the control point may be configured to assume responsibility for tracking the status of the resource in order to determine when it again becomes available, as shown in step S112. A slow polling technique is used to minimize unnecessary traffic on the network. During the interval in which the resource is unavailable, the agents either redirect requests to alternate resources or provide error messages, based on the instructions provided by the control point, as shown in step S116. Once the control point determines that the resource is again available, the control point shares this information with the agents and disables the instructions provided in steps S108 and S110, as shown in step S118.

This method of tracking and monitoring resource status has importantadvantages. First, it reduces unnecessary and frustrating accessattempts to unavailable resources. Instead of repeatedly attempting toperform a task, a user's requests are redirected so that the request canbe serviced successfully, or the user is provided with information aboutwhy the attempt(s) was unsuccessful. With this information in hand, theuser is less likely to generate wasteful network traffic with repeatedaccess attempts in a short period of time. In addition, network trafficis also reduced by having only one entity, usually a control point,assume responsibility for monitoring a resource that is unavailable.

In addition to assisting these resource monitoring functions, serverprofile module 162 maintains a dynamically updated list of the serversaccessed by agents within its domain. The server statistics may beretrieved using the configuration utility, or with a variety of otherexisting management platforms. The server statistics may be used fornetwork planning, or may be implemented into various system policies fordynamic enforcement by the agents and control points. For example, thecontrol points and agents can be configured to divert traffic fromheavily used servers or other resources.

DNS module 164 also performs certain particularized functions inaddition to aiding the resource monitoring and tracking described withreference to FIG. 12. Specifically, the DNS module maintains a local DNScache for efficient local address resolution. As discussed withreference to agent DNS module 136, the agents and control pointsinteract to resolve address requests, and may be configured to resolveaddresses by first referencing local DNS data maintained by the agentsand/or control points. Similar to server profile module 162, DNS module164 also maintains statistics for use in network planning and dynamicsystem policies.

In addition to the functions described above, administrator module 168maintains control point configuration parameters and distributes thisinformation to the agents within the domain. Similar to the server, DNSand gateway modules, administrator module 168 also aids in collectingand maintaining statistical data concerning network resources. Inaddition, administrator module 168 retrieves policy data fromcentralized policy repositories, and stores the policy data locally foruse by the control points and agents in enforcing system policies.

Control point 72 also includes a synchronization interface (not shown)for synchronizing information among multiple control points within thesame domain.

Message broker module 170 performs various functions to enable thecontrol point to communicate with the agents. Similar to agent messagebroker module 140, message broker module 170 includes message creatorand message dispatcher processes. The message creator process includesmember functions that receive message contents as parameters and returnencoded messages for transmission to agents and other network entities.Functions for decoding received messages are also included with themessage creator process. The dispatcher process transmits messages andensures reliable delivery through a retry mechanism and error detectionand recovery methods.

Referring now to FIGS. 13-16, both the agents and control points may be configured using configuration utility 106. Typically, configuration utility 106 is a platform-independent application that provides a graphical user interface for centrally managing configuration information for the control points and agents. To configure the control points and agents, the configuration utility interfaces with administrator module 134 of agent 70 and with administrator module 168 of control point 72. Alternatively, configuration utility 106 may interface with administrator module 168 of control point 72, and the control point in turn may interface with administrator module 134 of agent 70.

FIG. 13 depicts a main configuration screen 188 of configuration utility106. As indicated, main configuration screen 188 can be used to viewvarious managed objects, including users, applications, control points,agents and other network entities and resources. For example, screenframe 190 on the left side of the main configuration screen 188 may beused to present an expandable representation of the control points thatare configured for the network.

When a particular control point is selected in the main configurationscreen 188, various settings for the control point may be configured.For example, the name of the control point may be edited, agents andother entities may be added to the control point's domain, and thecontrol point may be designated as a secondary connection for particularagents or groups of agents. In addition, the system administrator mayspecify the total bandwidth available to agents within the controlpoint's domain for transmitting and receiving, as shown in FIG. 14. Thisbandwidth specification will affect the configured bandwidths CB andallocated bandwidths AB discussed with reference to control pointtraffic module 160 and the method depicted in FIG. 11.

Configuration utility 106 also provides for configuration of varioussettings relating to users, applications and resources associated with aparticular control point. For example, users may be grouped together forcollective treatment, lists of prohibited URLs may be specified forparticular users or groups of users, and priorities for applications maybe specified, as shown in FIG. 15. Priorities may also be assigned tousers or groups of users. As discussed above, this priority data plays arole in determining bandwidth allocations for the agents and theirassociated devices.

In addition, optimum and minimum performance levels may be establishedfor applications or other tasks using network resources. Referring againto the IP telephony example discussed above, the configuration utilitymay be used to specify a minimum threshold performance level for anetworked device running the IP telephony application. This performancelevel may be specified in terms of QoS performance parameters such asbandwidth, throughput, jitter, delay and loss. The agent moduleassociated with the networked device would then monitor the networktraffic associated with the IP telephony application to ensure thatperformance was above the minimum threshold. If the minimum level wasnot met, the control points and agents could interact to reallocateresources and provide the specified minimum service level. Similarly, anoptimum service level may be specified for various network applicationsand tasks. More generally, configuration utility 106 may be configuredto manage system policies by providing functionality for authoring,maintaining and storing system policies, and for managing retrieval ofsystem policies from other locations on a distributed network, such as adedicated policy server.

Referring now to FIG. 16, the configuration of various other controlpoint and agent parameters will be discussed. As seen in the figure,configuration utility 106 may be used to configure the interval at whichresource reallocation is performed. For example, the default intervalfor recalculating bandwidth allocations is 5000 milliseconds, or 5seconds. Also, as discussed above, the rate at which resource allocationoccurs may be specified in order to prevent overcompensation,unnecessary adjustments to allocations, and inefficientreconfigurations. Specifically, the percentage of over-utilizedbandwidth that is removed from a client device and reallocated elsewheremay be specified with the configuration utility, as seen in FIG. 16. Inaddition, the rate at which agents provide feedback to the controlpoints regarding network conditions or activities of their associateddevices may be configured.

In many of the examples discussed above, the systems and methods areimplemented architecturally in two tiers. The first tier may include oneor more control modules, such as control points 72. Because the controlpoints control and coordinate operation of agent modules 70, the controlpoints may be referred to as “upstream” or “overlying” components,relative to the agent modules that they control. By contrast, theagents, which form the second tier of the system, may be referred to as“downstream” or “underlying” components, relative to the control pointsthey are controlled by.

It will be appreciated that the systems and methods described herein areextremely flexible and scalable, and may be applied to networks ofwidely varying size. In some settings, scaling is achieved by extendinghierarchical implementations to three or more tiers. Indeed, a verylarge architectural model may be built by extending the two-tier exampleabove on the upstream and/or downstream side.

Such a model can be used to deliver consistent application performancein a very large and complex network configuration. An upstream componentmay control one or more downstream components. An upstream component canalso be configured to be a downstream component of some othercontrolling upstream entity. Similarly, a downstream component may beconfigured to be an upstream component to control its downstreamdelegates.

FIG. 17 depicts an exemplary multi-tier implementation. In this example,the Tier 1 component is upstream relative to three components in Tier 2.Because the Tier 2 components underlie and are controlled by the Tier 1component, the Tier 2 components are downstream components relative tothe Tier 1 component. As indicated, the components within Tiers 2, 3 and4 may be configured to function as both downstream and upstreamcomponents.

The agent modules and control points described herein may be implementedat any level within a multi-tier environment such as that shown in FIG.17. For example, the Tier 4 component may include a control module 72,as shown, that is configured to control agent modules 70 that may berunning on the underlying Tier 5 components.

Hierarchically, the structure of FIG. 17 typically is implemented in aone-to-many configuration moving downstream. In other words, an upstreamcomponent may control multiple downstream components in the tierimmediately below it, and those downstream components may controlmultiple components in further downstream tiers. However, a givendownstream component typically reports to and is controlled by only asingle upstream component. It should be appreciated that a wide varietyof configurations are possible, with any desired number of tiers andcomponents or groupings of components within the tiers.

In multi-tier environments such as that described above, it will oftenbe desirable to provide mechanisms for centrally defining anddistributing policies relating to management of bandwidth and otherresources. FIG. 18 depicts an exemplary networking environment employingsuch mechanisms. In the example, various branch offices are connected toa centralized data center 200 via network 202. The system may beconfigured to enable an administrator to define enterprise wide policieson a central server such as an Enterprise Policy Server (EPS) 204. Insuch an environment, the control point modules described herein may beimplemented in connection with a Controlled Location Policy Server(CLPS) 206, as shown in the exemplary branch office 208 at the bottom ofFIG. 18.

Typically, a given CLPS 206 retrieves policies pertinent to its branch office from its controlling EPS 204. The CLPS, which may include a control module 72, then distributes the retrieved policies to the relevant agent modules 70 within its domain, i.e., to the various agents that are controlled by that control point module. This policy definition and distribution scheme may easily be adapted and scaled to manage widely varying enterprise configurations. The distributed policies may, among other things, be used to facilitate the bandwidth management techniques described herein.

In addition to or instead of the previously described bandwidthmanagement features, bandwidth management may be effected using tieredimplementations such as that described above. FIG. 19 depicts anexemplary system 300 for managing bandwidth on network link 302.

As shown, system 300 may include various components referred to asbandwidth managers, which may be implemented within the control modules(e.g., control points 72) and agent modules (e.g., agent modules 70)described herein. For example, the depicted system includes a controlledlocation bandwidth manager (CL-BWMGR) 304, which may be implementedwithin previously described control point 72, and/or within a ControlledLocation Policy Server 206, such as shown in FIG. 18. In any case,CL-BWMGR 304 typically is implemented as a software program running on acomputing device, such as a server (not shown), connected to networklink 302.

The CL-BWMGR typically communicates with agent software (e.g., agentmodules 70) running on computing devices connected to network link 302,so as to manage bandwidth usage by those devices. For example, as shown,plural computing devices 22, also referred to as agent computers, may beinterconnected via network link 302. Each agent computer may be runningan agent bandwidth manager (AG-BWMGR) 306, one or more applicationbandwidth managers (APP-BWMGR) 308, and one or more socket bandwidthmanagers (SOCK-BWMGR) 310. The bandwidth managers loaded on agentcomputer 22 typically are sub-components of agent module 70, discussedabove.

Within a given computing device 22 connected to network link 302, typically there is one AG-BWMGR 306, and a variable number of APP-BWMGRs 308 and SOCK-BWMGRs 310, depending on the applications and sockets being utilized. For example, when the bandwidth managers are implemented within an agent module 70, the agent module typically is adapted to launch an APP-BWMGR 308 for each application 320 running on computer 22, and a SOCK-BWMGR 310 for each socket 322 opened by each application 320. As indicated, a given computing device may be running multiple applications 320, and a given application 320 may have multiple open sockets 322. Computing devices 22 transmit and receive data on network link 302 via transmit and receive control sections 324 and 326. Sections 324 and 326 may respectively include transmit and receive queues 328 and 330, as will be explained below.

In the depicted example, CL-BWMGR 304 manages overall bandwidth onnetwork link 302, a given AG-BWMGR 306 manages bandwidth usage by itsassociated agent computer 22, a given APP-BWMGR 308 manages bandwidthusage by its associated application 320, and a given SOCK-BWMGR 310manages bandwidth usage by its associated socket 322.

As indicated by the arrows interconnecting the different bandwidthmanagers, the bandwidth managers may be arranged in a hierarchicalcontrol configuration, in which an upstream component interacts withand/or controls one or more underlying downstream components. In thedepicted example, the CL-BWMGR manages bandwidth usage on network link302 by interacting with underlying AG-BWMGRs, so as to manage bandwidthusage by the particular downstream computing devices 22, applications320 and sockets 322 that underlie the CL-BWMGR. It should be appreciatedthat the CL-BWMGR may manage bandwidth usage on any type ofinterconnection between computers. For example, in FIG. 2, a givenremote network 14 could include a server computer running a CL-BWMGR 304interconnected via a local network segment with plural client computers,where each client computer was loaded with an agent module 70. In such acase, the control computer could control the clients' bandwidth usage onnot only the local network segment (e.g., an Ethernet-based LAN), butalso on the connection through router 18 to public network 16.

Referring again to FIG. 19, a given AG-BWMGR 306 manages bandwidth usageby its associated computing device 22 by interacting with underlyingAPP-BWMGRs, so as to manage bandwidth usage by the particular downstreamapplications 320 and sockets 322 that underlie the AG-BWMGR. Likewise, agiven APP-BWMGR 308 manages bandwidth usage by its associatedapplication 320 by interacting with underlying SOCK-BWMGRs, so as tomanage bandwidth usage by the particular downstream sockets 322 thatunderlie the APP-BWMGR. As explained below, downstream componentstypically facilitate control by reporting certain data, such asbandwidth consumption, upstream to overlying upstream components.

In the depicted example, bandwidth may be managed by taking a bandwidthallocation existing at a particular hierarchical level, andsub-allocating the allocation for apportionment amongst downstreamcomponents. Where there are multiple downstream components,sub-allocation typically involves dividing the allocation into portionsfor each of the downstream components.

For example, CL-BWMGR 304 may have an allocation corresponding to theavailable bandwidth on network link 302. The available bandwidth may beconfigured according to a system policy specifying parameters fornetwork link 302, or may be determined or configured through othermethods. The available bandwidth on link 302 may be sub-allocated byCL-BWMGR 304 into one or more agent allocations, depending on the numberof agent computers 22 underlying the CL-BWMGR. Each agent allocationrepresents the amount of bandwidth allotted to the respective agentcomputer. For example, if one hundred agent computers 22 are controlledby the CL-BWMGR, then the CL-BWMGR would typically sub-allocateavailable link bandwidth into one hundred individualized agentallocations for the respective underlying agent computers. Theindividualized agent allocation would be provided to the respectiveAG-BWMGRs at each agent computer 22 for enforcement and furthersub-allocation.

Similarly, at each agent computer, the associated AG-BWMGR 306 maysub-allocate its agent allocation into individualized applicationallocations for each application 320 running on the agent computer. Agiven application allocation represents the amount of bandwidth allottedto the corresponding application. These application allocationstypically are provided to the respective APP-BWMGRs for enforcement andfurther sub-allocation. Finally, at each application 320, the associatedAPP-BWMGR 308 may sub-allocate its application allocation intoindividual socket allocations for the sockets 322 that are open for thatapplication. The individual socket allocations typically are provided tothe respective SOCK-BWMGRs 310.
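The cascade from link bandwidth to agent, application and socket allocations may be sketched as a recursive proportional split. The weights, the nested-dictionary layout and the function name `suballocate` are hypothetical and are assumed here only so that the example is concrete; the description above leaves the apportionment rule at each level open.

```python
def suballocate(total, children, prefix=()):
    """Recursively divide `total` among `children` in proportion to their weights.

    children: dict mapping name -> (weight, sub_children), where sub_children
              is another dict of the same shape, or None for a leaf
    Returns a dict mapping each component's path (a tuple of names) to its
    allocation, covering every level of the hierarchy.
    """
    result = {}
    weight_sum = sum(weight for weight, _ in children.values())
    for name, (weight, sub_children) in children.items():
        share = total * weight / weight_sum if weight_sum else 0.0
        path = prefix + (name,)
        result[path] = share
        if sub_children:
            # the component's own allocation becomes the total to be apportioned
            # among its downstream components
            result.update(suballocate(share, sub_children, path))
    return result


# One link shared by two agent computers; agent1 runs two applications,
# and app1 has two open sockets.
tree = {
    "agent1": (1, {
        "app1": (3, {"sock1": (1, None), "sock2": (1, None)}),
        "app2": (1, None),
    }),
    "agent2": (1, None),
}
allocations = suballocate(1000, tree)
```

In this sketch, each level plays the role of one bandwidth manager: the top-level call corresponds to the CL-BWMGR, the first level of recursion to an AG-BWMGR, and the next to an APP-BWMGR.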

Typically, at least some of the bandwidth managers are configured tocommunicate upstream in order to facilitate the various sub-allocationsdiscussed above. Indeed, data may be provided upstream from each groupof SOCK-BWMGRs 310 to their overlying APP-BWMGR 308, from each group ofAPP-BWMGRs 308 to their overlying AG-BWMGR 306, and from each group ofAG-BWMGRs 306 to their overlying CL-BWMGR 304. The data providedupstream may be used to calculate sub-allocations, which are then sentback downstream for enforcement, and/or further sub-allocation. Often,it will be desirable that the data which is sent upstream pertain tosocket activity, so as to efficiently adjust future allocations to takebandwidth from where it is less needed and provide it to where it ismore needed.

Indeed, the interaction between the various bandwidth managers typically is performed to efficiently allocate network bandwidth among the various bandwidth-consuming components. As indicated above, consuming components may include agent computers 22, applications 320 running on those computers, and sockets 322 open for those applications. One way in which the systems described herein may efficiently allocate bandwidth is by shifting bandwidth allocations toward high priority uses and away from uses of relatively lower priority. Another allocation criterion that may be employed involves providing future allocations based on consumption of past allocations. For example, allocations to a particular bandwidth manager (e.g., an application allocation provided from an AG-BWMGR 306 to an APP-BWMGR 308) may be reduced if past allocations were only partially consumed, or if there is a trend of diminished usage.

The interactions discussed above need not occur in any particular order,and may occur periodically or non-periodically, and/or at differentrates for different levels. For example, CL-BWMGR 304 may hand out agentallocations for computers 22 at regular intervals, or only when achanged condition within the network is detected. Applicationallocations and socket allocations may be disseminated downstreamperiodically, but at different rates. In any case, it will often beadvantageous to repeatedly and dynamically update the allocations andsub-allocations, as will be discussed in more detail below.

Bandwidth management for a representative socket will now be described with reference to FIG. 20, which shows an agent computing device 22 coupled to other computers (not shown) via network link 302. Computing device 22 is running an application program 320 which communicates over network link 302 via a layered protocol stack (depicted in part at 92), as described with reference to earlier examples. Agent module 70 may be interposed between application program 320 and lower layers of stack 92, typically at a point where flow control may be achieved. Specifically, as described with reference to FIGS. 6, 7 and 8, agent module 70 typically is positioned in a relatively high location within the protocol stack and interfaced with a socket object, to allow network traffic to be intercepted, or hooked into, at a point between application program 320 and transport layer 124 of stack 92. The agent module 70 depicted in FIG. 20 may include some or all of the components described with reference to FIG. 9. In particular, a redirector 130 (FIG. 9) may be employed to allow the agent module to intercept, or hook into, the socket data flow between application 320 and network link 302.

For the depicted representative socket, traffic between applicationprogram 320 and network link 302 flows through transmit control section324 and receive control section 326, which may include transmit queues328 and receive queues 330, respectively. Data flows through controlsections 324 and 326 may be monitored and controlled via SOCK-BWMGR 310of agent module 70. As described above, SOCK-BWMGR 310 may interact withupstream bandwidth managers (not shown in FIG. 20) implemented withinagent module 70. Specifically, socket allocations may be provided toSOCK-BWMGR 310, which controls transmit control section 324 and receivecontrol section 326 to ensure that those allocations are enforced. Also,as indicated, SOCK-BWMGR 310 may provide feedback (e.g., concerningsocket activity) to various upstream bandwidth managers. This feedbackmay be used to facilitate future allocations.

Socket allocations may be provided to SOCK-BWMGR 310 periodically at regular intervals. In such a case, the allocation typically is in the form of a number of bytes that may be transmitted, and/or a number of bytes that may be received, during a set interval (there may be separate transmit and receive allocations). During the interval, data passes through control sections 324 and 326, and SOCK-BWMGR 310 monitors the amount of data sent to, or received from, network link 302. Provided that the allocation for a given interval is not exceeded, SOCK-BWMGR 310 may simply passively monitor the traffic. However, once the allocation for a particular interval is used up, further transmission and reception is prevented, until a new allocation is obtained (e.g., during a succeeding interval). As indicated, queues 328 and 330 may be provided to hold overflow, until the socket is replenished with a new allocation, at which point the queued data may be transmitted and/or received. When an allocation is not exceeded, queues 328 and 330 typically are not used, such that transmitted or received data passes through sections 324 and/or 326 without any queuing of data. In these implementations, queuing is triggered only when the socket allocation for a given interval is exceeded.
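The per-interval enforcement just described may be sketched, for the transmit direction only, as a byte budget backed by an overflow queue. The class name `SocketQuota` and its methods are hypothetical, and the sketch makes simplifying assumptions: unused allocation is not carried over between intervals, and a chunk of data larger than a whole interval's allocation would remain queued indefinitely.

```python
from collections import deque


class SocketQuota:
    """Byte budget for one socket and one direction (illustrative only)."""

    def __init__(self):
        self.remaining = 0      # bytes still permitted in the current interval
        self.queue = deque()    # overflow held until a new allocation arrives

    def send(self, data):
        """Return True if `data` passes straight through, False if it is queued."""
        if not self.queue and len(data) <= self.remaining:
            self.remaining -= len(data)   # within allocation: passive monitoring only
            return True
        self.queue.append(data)           # allocation used up: hold the overflow
        return False

    def replenish(self, allocation_bytes):
        """Start a new interval and release as much queued data as now fits."""
        self.remaining = allocation_bytes
        released = []
        while self.queue and len(self.queue[0]) <= self.remaining:
            chunk = self.queue.popleft()
            self.remaining -= len(chunk)
            released.append(chunk)
        return released


quota = SocketQuota()
quota.replenish(10)                 # 10 bytes allowed this interval
first = quota.send(b"123456")       # 6 bytes pass through unqueued
second = quota.send(b"12345")       # would exceed the allocation, so it is queued
drained = quota.replenish(10)       # next interval: the queued data is released
```

Queuing occurs only after the budget for the interval is exhausted; while the budget lasts, data passes through without touching the queue, consistent with the behavior described above.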

In many cases, it will be desirable that the upstream feedback for thesocket (e.g., the feedback sent upstream by SOCK-BWMGR 310) includespecification of how much bandwidth was consumed by the socket (e.g.,how many bytes were sent or received). This consumption data may be usedby upstream bandwidth managers in performing sub-allocations, as will beexplained below.

In connection with the agent device management described herein, theterm socket should be broadly understood to mean network conversationsor connections carried out by applications above the transport protocollayer (e.g., layer 124 in FIGS. 6, 7, 8 and 20). In most cases, theseconversations or connections are characterized at least in part by theability to implement flow control. Accordingly, it will be appreciatedthat the agent modules are not limited to any one particular standardthat is employed within the upper layers of the protocol stack. Indeed,the agent modules described herein may hook into network conversationscarried on by applications that do not use the pervasive Winsockstandard.

Referring particularly to FIG. 21, exemplary agent computer 22 is shownas running applications 320 a, 320 b, and 320 c. Application 320 aemploys the Winsock standard, and thus communicates with network link302 via Winsock API 128 and lower layers of protocol stack (e.g., thelayers shown at 340 and 342). Accordingly, agent module 70 a may beinterposed as shown relative to API 128 to hook into the networkcommunications and carry out the monitoring and management functionsdescribed at length herein.

In contrast, applications 320 b and 320 c do not work with the Winsock standard. An example of such an application is an Active Directory application that employs the NetBIOS standard. For these non-Winsock applications, an alternate agent module 70 b may be provided. Similar to module 70 a and the previously discussed agent module embodiments, agent module 70 b is able to hook into network conversations above the transport layer. The module does not require Winsock, though it is still active at a high protocol layer where flow control may be employed and where a rich variety of parameters may be accessed to enhance monitoring and policy-based management.

Returning to the discussion of bandwidth management, various differentmethodologies may be employed to provide downstream sub-allocations ofbandwidth. Many of these methodologies employ a priority scheme, inwhich priorities may be assigned via network policies. Priority may beassigned based on a variety of parameters, including IP source address,IP destination address, source port, destination port, protocol,application identity, user identity, device identity, URL, availabledevice bandwidth, application profile, server profile, gateway identity,router identity, time-of-day, network congestion, network load, networkpopulation, available domain bandwidth and status of network resources.Other parameters are possible. As discussed above, the position of thedescribed agent modules relative to the protocol stack allows access tomany different parameters. The ability to predicate bandwidth managementon such varying criteria allows for increased control over bandwidthallocations, and thereby increases efficiency.

In typical implementations, the assigned priorities discussed above may be converted into or expressed in terms of a priority level that is specific to a given network “conversation,” or connection, such as a socket data flow. Indeed, many of the sub-allocation schemes discussed below depend on socket priorities for individual sockets. In the policy-based systems described herein, the socket priorities typically are configured values derived from various parameters such as those listed above. For example, the configured priority for a given socket might be determined by a combination of user identity, the type of application being run, and the source address of the requested data.

Referring again to FIG. 19, various exemplary methods for sub-allocating bandwidth will now be discussed. Beginning with the sub-allocation of an application allocation into one or more socket allocations, as may be performed by APP-BWMGR 308, assume that there are k sockets 322 open for a given application 320. For a given socket s (where s=1 through k), an effective socket priority ESP(s) may be calculated as follows:

$$\mathrm{ESP}(s) = \mathrm{SP}(s) \times \frac{\mathrm{cons\_last\_N}(s)}{\mathrm{alloc\_last\_N}(s)} \times \mathrm{factor}, \qquad (1)$$

where SP(s) is the configured priority for the socket (e.g., as determined by network policies); cons_last_N(s) is the bandwidth consumed by the socket during the last N allocation cycles; and alloc_last_N(s) is the bandwidth allocated to the socket over the last N cycles. In addition, the result may be multiplied by a convenient factor to avoid handling of fractional amounts.

Then, the socket allocation SALLOC(s) for a given socket may be calculated as:

$$\mathrm{SALLOC}(s) = \mathrm{APALLOC} \times \frac{\mathrm{ESP}(s)}{\sum_{s=1}^{k} \mathrm{ESP}(s)}, \qquad (2)$$

where APALLOC is the application allocation being apportioned by the APP-BWMGR among the various sockets; and ESP(s) is the effective priority of the socket. It should thus be appreciated that the apportionment is derived via a weighted average of the effective socket priorities.

Accordingly, in the above example, socket allocations will tend to be higher for sockets having higher priorities, and for sockets consuming larger percentages of prior allocations. In the above example, calculations are based on consumption data from multiple past allocation cycles (N cycles), though a single cycle calculation may be used. In some cases, multiple cycle calculations may smooth transitions and stabilize adjustment of allocations among the various sockets. Typically, the calculations described above are performed by the overlying APP-BWMGR based on bandwidth consumption data received from underlying SOCK-BWMGRs.
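Equations (1) and (2) can be illustrated with a minimal Python sketch. The class name `SocketStats`, the function names, the default factor of 100, and the choice to give a socket with no prior allocation an effective priority of zero are all assumptions made for illustration; they are not taken from the described system. The sketch covers steady-state operation only.

```python
from dataclasses import dataclass


@dataclass
class SocketStats:
    """Per-socket inputs to equations (1) and (2); names are illustrative."""
    sp: float            # SP(s): configured socket priority
    cons_last_n: float   # bandwidth consumed over the last N cycles
    alloc_last_n: float  # bandwidth allocated over the last N cycles


def effective_socket_priority(s: SocketStats, factor: float = 100.0) -> float:
    """Equation (1). Assumption: no prior allocation yields ESP = 0."""
    if s.alloc_last_n <= 0:
        return 0.0
    return s.sp * (s.cons_last_n / s.alloc_last_n) * factor


def socket_allocations(apalloc: float, sockets: list[SocketStats]) -> list[float]:
    """Equation (2): split APALLOC in proportion to each socket's ESP."""
    esps = [effective_socket_priority(s) for s in sockets]
    total = sum(esps)
    if total <= 0:
        # The ratio in equation (2) is undefined; this sketch does not handle it.
        raise ValueError("no active sockets: equation (2) is undefined")
    return [apalloc * esp / total for esp in esps]
```

In this sketch, a priority-2 socket that consumed half of its prior allocation and a priority-1 socket that consumed all of its prior allocation have equal effective priorities, so they receive equal shares of the application allocation.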

Turning now to the sub-allocation of agent allocations into one or more application allocations, as may be performed by AG-BWMGR 306, assume that there are j applications 320 running on a given computer 22. First, an effective application priority may be calculated for each application. Using the above application as an example, the effective application priority EAPP may be calculated as follows:

$$\mathrm{EAPP} = \sum_{s=1}^{k} \frac{\mathrm{cons\_last\_N}(s) \times \mathrm{SP}(s)}{\mathrm{alloc\_last\_N}(s)} \times \mathrm{factor}, \qquad (3)$$

Again, a factor may be employed to avoid handling of fractions, or to otherwise facilitate processing. A similar calculation is performed for all of the other applications 320 underlying the AG-BWMGR. Then, similar to the socket sub-allocations above, the individual application allocations may be produced with a weighted average. Specifically, for a given application ap (where ap=1 through j), an application allocation APALLOC(ap) may be calculated as follows:

$$\mathrm{APALLOC}(ap) = \mathrm{AGALLOC} \times \frac{\mathrm{EAPP}(ap)}{\sum_{ap=1}^{j} \mathrm{EAPP}(ap)}, \qquad (4)$$

where AGALLOC is the agent allocation being apportioned by the AG-BWMGR among the various applications; and EAPP(ap) is the effective priority of the application for which the allocation is being calculated. Typically, the calculations described above are performed by the overlying AG-BWMGR based on bandwidth consumption data received from underlying APP-BWMGRs 308. In the above example, application allocations will tend to be higher for applications having more sockets open, higher priority sockets, and sockets consuming greater percentages of prior socket allocations.
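As a rough illustration of equations (3) and (4), the following Python sketch computes effective application priorities from per-socket data and then apportions an agent allocation. The tuple layout, the function names, the application names, and the default factor of 100 are hypothetical. Skipping sockets that have no prior allocation is also an assumption of the sketch.

```python
# Each socket is described by a tuple: (SP, cons_last_N, alloc_last_N).
Socket = tuple[float, float, float]


def effective_app_priority(sockets: list[Socket], factor: float = 100.0) -> float:
    """Equation (3): sum SP(s) * cons_last_N(s) / alloc_last_N(s) over sockets."""
    return sum(
        sp * cons / alloc * factor
        for sp, cons, alloc in sockets
        if alloc > 0  # assumption: sockets with no prior allocation contribute 0
    )


def application_allocations(
    agalloc: float, apps: dict[str, list[Socket]]
) -> dict[str, float]:
    """Equation (4): split AGALLOC in proportion to each application's EAPP."""
    eapps = {name: effective_app_priority(socks) for name, socks in apps.items()}
    total = sum(eapps.values())
    if total <= 0:
        # The ratio in equation (4) is undefined; this sketch does not handle it.
        raise ValueError("no socket activity: equation (4) is undefined")
    return {name: agalloc * eapp / total for name, eapp in eapps.items()}
```

With two fully active priority-1 sockets on one application and a single such socket on another, the first application receives twice the share of the second, which reflects the dependence on the number of open sockets noted above.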

Turning now to the sub-allocation of link bandwidth (e.g., overall bandwidth available on network link 302) into one or more agent allocations, as may be performed by CL-BWMGR 304, assume that there are i agent computers 22 interconnected by network link 302. As with the above sub-allocations, an effective priority may be calculated and then bandwidth may be apportioned according to a weighted average of the effective priorities. Specifically, using the above agent as an example, the effective agent priority EAGP may be calculated by summing the underlying effective application priorities as follows:

$$\mathrm{EAGP} = \sum_{ap=1}^{j} \mathrm{EAPP}(ap) \qquad (5)$$

A similar calculation may be performed for all of the other agent computers underlying the CL-BWMGR, in order to obtain all of the effective agent priorities. In some cases, it may be desirable to weight the newly calculated value with the prior value to obtain the effective priority, in order to smooth adjustments and bandwidth reallocations among the agent computers. In certain implementations, for example, it has proved advantageous to calculate the effective agent priority by blending the new value (derived from above equation) with the most recent value in a 60-40 ratio. Any other desirable weighting may be employed, and/or other methods may be used to smooth allocation transitions. In any case, the sub-allocation to the agent computers may be effected with a weighted average:

$$\mathrm{AGALLOC}(ag) = \mathrm{BWCL} \times \frac{\mathrm{EAGP}(ag)}{\sum_{ag=1}^{i} \mathrm{EAGP}(ag)}, \qquad (6)$$

where AGALLOC(ag) is the agent allocation for a given agent computer ag (ag=1 through i); BWCL is the bandwidth available to be allocated among all the agent computers; and EAGP(ag) is the effective agent priority for the given agent computer.
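The agent-level calculation, including the smoothing step, might be sketched in Python as follows. The function names are hypothetical, and reading the 60-40 ratio as 60% new value and 40% prior value is an assumption; the text does not state which value carries the larger weight.

```python
def effective_agent_priority(eapps: list[float]) -> float:
    """Equation (5): sum the effective application priorities on one agent."""
    return sum(eapps)


def blend_priority(
    new_value: float, prior_value: float, new_weight: float = 0.6
) -> float:
    """Smooth the priority; assumes the 60-40 blend weights the new value 60%."""
    return new_weight * new_value + (1.0 - new_weight) * prior_value


def agent_allocations(bwcl: float, eagps: list[float]) -> list[float]:
    """Equation (6): split BWCL in proportion to each agent's EAGP."""
    total = sum(eagps)
    if total <= 0:
        # The ratio in equation (6) is undefined; this sketch does not handle it.
        raise ValueError("all agent priorities are zero: equation (6) is undefined")
    return [bwcl * eagp / total for eagp in eagps]
```

Under this reading, an agent whose newly calculated priority jumps from 100 to 300 would be treated as having a priority of about 220 for the next allocation, so its share grows over several cycles rather than at once.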

In the above examples, the activity of individual sockets typically is communicated upstream (e.g., from a SOCK-BWMGR to an overlying APP-BWMGR, to an overlying AG-BWMGR, and to an overlying CL-BWMGR), and has an effect on future allocations and sub-allocations throughout the system. Consumption at a particular socket may affect future allocations to that socket and other sockets on the same application. In addition, because the consumption data is communicated upstream and used in other allocations and sub-allocations, the same socket can potentially affect future allocations to all sockets, applications and agent computers within the system. In certain implementations this feedback effect can greatly increase the efficiency of bandwidth management.

Those skilled in the art will appreciate that allocations and sub-allocations may be performed separately for transmitted data and received data. In addition, the upstream feedback and downstream sub-allocations may occur at any desired frequency, or at irregular intervals. Within agent computer 22, the sub-allocations typically are performed at regular intervals, though it will often be desirable to re-calculate socket allocations at a greater frequency than the application allocations. As an example, agent allocations may be updated every 2 seconds, application allocations every half second, and socket allocations every 100 milliseconds.

The exemplary equations above apply primarily during steady state operation, when all involved sockets are open, have non-zero allocations, and are at least somewhat active (e.g., consuming some bandwidth). Typically, some provision or modification is made for startup conditions, and for conditions where sockets are idle and/or have socket allocations that are negligible or zero. For example, on occasion an agent computer may have launched applications that are not communicating with the network. In this case, the effective application priorities would all be zero because there is no socket activity, leading to an undefined result in equation (4). A straight average may be used to address this condition, in which case available bandwidth is apportioned among the applications equally. Equal apportioning may also be used in the case of undefined results in equations (2) and (6).
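The equal-apportioning fallback can be expressed as a small helper shared by all three tiers. This is only a sketch; the function name `apportion` and its signature are assumptions.

```python
def apportion(total_bandwidth: float, priorities: list[float]) -> list[float]:
    """Split bandwidth by effective priority, or equally if all are zero.

    The proportional branch has the form of equations (2), (4) and (6);
    the equal branch is the straight average used when the priority sum
    is zero and the ratio would otherwise be undefined.
    """
    count = len(priorities)
    if count == 0:
        return []
    priority_sum = sum(priorities)
    if priority_sum == 0:
        return [total_bandwidth / count] * count
    return [total_bandwidth * p / priority_sum for p in priorities]
```

For example, three applications with no socket activity would each receive one third of the agent allocation under this helper.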

Referring now to equation (1), the APP-BWMGRs and/or other bandwidth managers may be configured to over-provision bandwidth. This may be done to ensure that truly idle sockets maintain an effective socket priority of zero, and thus do not obtain positive allocations (e.g., through applying the weighted average of equation (2)). Over-provisioning may also be used to provide internal allocations to idle sockets that are becoming active, so as to allow them to achieve a non-zero effective socket priority and thereby obtain a share of the application allocation being sub-allocated by the overlying APP-BWMGR 308.

Regardless of the particular methods employed, bandwidth may be apportioned among entities or components at a particular level based on socket status of a given component, or of components underlying that component. Socket status may include the number of sockets open within the domain of a given control module, on a given computing device, or for a given application. Equations (3) and (4) above provide an example of apportioning bandwidth among applications based partly on the number of sockets open at each application. All other things being equal, in these equations an application with more sockets open will have a higher effective priority and will receive a greater share of the allocated bandwidth.

From the above, it will be appreciated that socket status may also include assigned priorities of sockets. As discussed at length above, assigned priority may be derived from a nearly limitless array of parameters, and may be set via network policies. The above equations provide several examples of allocation and sub-allocation being performed based on socket priority. All other things being equal, a socket with a higher socket priority, or an application or computer with higher priority sockets, will receive larger allocations in several of the above examples.

Socket status may also include consumption of past allocations, as should be appreciated from the above exemplary equations. The above equations provide numerous examples of shifting bandwidth allocations away from sockets, applications and computers consuming a lower portion of past allocations, relative to counterpart sockets, applications and computers.

Implementation of the above exemplary systems and methods may enable bandwidth resources to be allocated to high priority, critical tasks. Bandwidth may be shifted away from low priority uses, and/or consuming entities that are relatively idle in comparison to their counterparts. The different embodiments above may be implemented to provide a more granular, or finer, level of control over bandwidth usage of a particular component within the system. In particular, the bandwidth manager scheme described above may be configured to control socket and/or transaction level allocation for multiple conversations carried out by an application. For example, a given application may carry out a number of functions having varying levels of priority. The increased granularity described above allows system operators to assign relative bandwidth priorities to the various tasks carried out by a given application.

Finer control over bandwidth usage may be obtained by providing for dynamic modification of socket priorities after they have been assigned. As discussed above, socket priorities (e.g., socket priorities SP in equations (1) and (3) above) typically are assigned as sockets are opened on the associated computer, and the assigned priority can be derived from virtually any variety of policy parameters. Though the socket priorities may remain static while the socket is open, the agent module may alternatively be configured to dynamically vary the socket priority when certain mission critical network tasks are being performed by the socket. These mission critical tasks may be defined through the policy mechanisms discussed above, using any desirable combination of policy parameters. Upon detection of a predefined task, the assigned priority for the socket is modified (e.g., by overriding the assigned value with a higher priority). This modification typically is employed to ensure that a desired level of service or resources is available for the predefined task (e.g., a desired allocation of bandwidth). Alternatively, the predefined task may be a low priority task, such that the socket priority may be dynamically downgraded, to ensure that bandwidth is available for more important tasks. Any of the above embodiments or method implementations may be adapted to provide for such dynamic modification of priorities.
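One way to picture dynamic priority modification is a socket record that keeps its assigned priority and an optional override. The Python sketch below is illustrative only: the class `ManagedSocket`, the method names, the task names, and the idea of representing the policy as a mapping from task name to priority are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ManagedSocket:
    """A socket whose assigned priority can be temporarily overridden."""
    assigned_priority: int
    override: Optional[int] = None

    @property
    def priority(self) -> int:
        """The priority currently in effect, i.e. SP(s) in equations (1) and (3)."""
        return self.assigned_priority if self.override is None else self.override

    def begin_task(self, task: str, task_policy: dict[str, int]) -> None:
        """Apply an override if the detected task is predefined in the policy."""
        if task in task_policy:
            self.override = task_policy[task]

    def end_task(self) -> None:
        """Restore the originally assigned priority."""
        self.override = None


# Hypothetical policy: one mission critical task is upgraded, one
# low priority task is downgraded.
TASK_POLICY = {"order_submit": 10, "bulk_sync": 1}
```

Because the override feeds the same priority value used in the allocation equations, the next allocation cycle would pick up the change without any other modification to the calculation.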

The granular control at the transaction level (e.g., by the agent module embodiments discussed herein) enables the system to detect the commencement of critical transactions. Upon detection of such a transaction, the system is able to communicate through a chain of upstream components to ensure that a sufficient amount of bandwidth is reserved to complete the transaction within an acceptable amount of time. For example, a dynamic increase of socket priority typically will lead to increased allocations for not only the socket, but also for the application associated with the socket, and for the computing device on which the application is running (e.g., through application of the exemplary equations discussed above). Alternatively, bandwidth may be dynamically decreased for low priority transactions, which may also be communicated upstream to affect allocations at the various tiers.

For example, the above exemplary systems may be configured so that mission-critical bandwidth priority is assigned to only a specified portion of the data requested from a given web server application. In this example, the network policy could specify that when a browser on a client computer attempts to access a particular web page, socket priorities are overridden to ensure that critical data is provided at a high level of service. Other data (e.g., trivial graphical information) could be assigned lower priority. More particularly, in commonly employed protocols, loading a single web page may involve several “get” commands issued by a client browser application to obtain all of the data for the web page. The agent modules described herein may be configured with a policy that specifies that the agent module is to monitor its associated computer to look for “get” commands seeking predefined high priority portions of data on the web page. Upon detection of such a get command, the socket level priority would be dynamically adjusted for that portion of the web page data.
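The web page example might be approximated as follows, assuming the “get” commands are HTTP GET request lines and that the policy identifies high priority data by path prefix. Both of those choices, along with the function name and the sample paths, are assumptions made for illustration and are not specified by the text.

```python
def priority_for_request(
    request_line: str,
    base_priority: int,
    high_priority_prefixes: list[str],
    boosted_priority: int,
) -> int:
    """Return the socket priority to use while this request is serviced.

    Only GET requests whose path matches a policy prefix are boosted;
    every other request keeps the socket's base priority.
    """
    parts = request_line.split()
    if len(parts) >= 2 and parts[0].upper() == "GET":
        path = parts[1]
        if any(path.startswith(prefix) for prefix in high_priority_prefixes):
            return boosted_priority
    return base_priority


# Hypothetical policy: account data is critical, images are not listed.
POLICY_PREFIXES = ["/account/"]
```

A request line such as `GET /account/balance HTTP/1.1` would be boosted under this policy, while `GET /images/banner.gif HTTP/1.1` on the same page would keep the base priority.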

Another example which illustrates the benefit of increased granularity is use of a multi-media application to deliver audio, video and other data over a wide-area network. Where network congestion is high, it typically will be desirable to assign a higher priority to the audio stream. The increased granularity of the described system allows this prioritization through application of an appropriate network policy, and ensures that the video stream does not adversely affect the audio delivery in a significant way.

The systems described herein may be further provided with a layered communication architecture to facilitate communication between components. Typical implementations involve use of a communication library to present a communication abstraction layer to various communicating components. For a given communicating component, the abstraction layer is architecturally configured to hide the details of the communications mechanism and/or the location of the other communicating components. This makes the components independent of the underlying transport mechanism used. The transport mechanism can be TCP, UDP or any other transport protocol. The transport protocol can be replaced with a new protocol without affecting the components and their operations.

The components communicate with each other by “passing objects” at the component interface. The communication layer may convert these objects into binary data, serialize them to XML format, provide compression and/or encrypt the information to be transmitted over any media. These operations typically are performed so that they are transparent to the communicating components. When applicable, the communications layer also increases network efficiency through multiplexing of several communication streams onto a single connection.
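A minimal sketch of such a communication layer, using only the Python standard library, is shown below. It covers XML serialization and compression; encryption and multiplexing are omitted. The function names, the message shape (a kind plus string fields), and the sample field names are hypothetical.

```python
import xml.etree.ElementTree as ET
import zlib


def encode_message(kind: str, fields: dict[str, str]) -> bytes:
    """Serialize a message to XML, then compress it for transmission.

    `kind` and the keys of `fields` must be valid XML element names.
    """
    root = ET.Element(kind)
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    return zlib.compress(ET.tostring(root, encoding="utf-8"))


def decode_message(payload: bytes) -> tuple[str, dict[str, str]]:
    """Reverse encode_message: decompress, parse the XML, rebuild the fields."""
    root = ET.fromstring(zlib.decompress(payload))
    return root.tag, {child.tag: child.text or "" for child in root}
```

A component would call `encode_message("ConsumptionReport", {"socket": "7", "consumed": "1200"})` and hand the resulting bytes to whatever transport is in use. Because the payload is opaque bytes, the component code stays the same whether the transport is TCP, UDP or something else.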

While the present embodiments and method implementations have been particularly shown and described, those skilled in the art will understand that many variations may be made therein without departing from the spirit and scope defined in the following claims. The description should be understood to include all novel and non-obvious combinations of elements described herein, and claims may be presented in this or a later application to any novel and non-obvious combination of these elements. Where the claims recite “a” or “a first” element or the equivalent thereof, such claims should be understood to include incorporation of one or more such elements, neither requiring nor excluding two or more such elements.

1. An agent module configured to be loaded onto a computer to manage network bandwidth usage by such computer, the agent module comprising: an agent bandwidth manager; and an application bandwidth manager, the agent module being configured to launch one such application bandwidth manager for each of multiple applications running on the computer, where the agent bandwidth manager and application bandwidth managers are configured to dynamically interact so as to sub-allocate network bandwidth available to the computer into individualized application allocations of network bandwidth for each of the applications, and where each application allocation depends on a socket status of the corresponding application, relative to other applications which are to receive a portion of the network bandwidth available to the computer.
 2. The agent module of claim 1, where each application allocation of network bandwidth depends on how many sockets are open for the corresponding application, relative to other applications which are to receive a portion of the network bandwidth available to the computer.
 3. The agent module of claim 1, where each application allocation of network bandwidth depends on assigned priorities for sockets of the corresponding application, relative to other applications which are to receive a portion of the network bandwidth available to the computer.
 4. The agent module of claim 1, where each application allocation of network bandwidth depends on consumption of prior allocations for the corresponding application, relative to other applications which are to receive a portion of the network bandwidth available to the computer.
 5. The agent module of claim 1, where each application allocation of network bandwidth depends on how many sockets are open for the corresponding application, assigne
 6. The agent module of claim 1, further comprising a socket bandwidth manager, the agent module being configured to launch one such socket bandwidth manager for each of multiple sockets open on the computer.
 7. The agent module of claim 6, where the agent module is configured so that, when one of the applications running on the computer has multiple associated sockets, the application bandwidth manager associated with such application dynamically interacts with the socket bandwidth managers of those sockets so as to sub-allocate the application allocation of network bandwidth into a socket allocation of network bandwidth for each of the sockets.
 8. The agent module of claim 7, where the application bandwidth manager and socket bandwidth managers are configured to dynamically and repeatedly update the socket allocations of network bandwidth in response to changing conditions at the sockets.
 9. The agent module of claim 1, where the agent bandwidth manager and application bandwidth managers are configured to dynamically and repeatedly update the individualized application allocations of network bandwidth in response to changing conditions at the applications.
 10. The agent module of claim 1, where the agent module is adapted to: assign a socket priority to a socket opened on the computer on which the agent module is loaded; provide a socket allocation of network bandwidth to the socket based on the socket priority; and detect whether the socket is being used to perform a predefined network transaction and, after such detection, modify the socket priority for such socket and update the socket allocation of network bandwidth to account for such modification.
 11. An agent module configured to be loaded onto a computer to manage network bandwidth usage by such computer, the agent module comprising: an agent bandwidth manager; and a socket bandwidth manager, the agent module being configured to launch one such socket bandwidth manager for each of multiple sockets open on the computer, where the agent bandwidth manager and the socket bandwidth managers are configured to interact so as to divide network bandwidth available to the computer into individualized socket allocations of network bandwidth for each of the sockets, and where the agent bandwidth manager and socket bandwidth managers are configured to dynamically and repeatedly update the socket allocations based on consumption activity of the sockets relative to each other.
 12. The agent module of claim 11 further comprising: an application bandwidth manager, the agent module being configured to launch one such application bandwidth manager for each of multiple applications running on the computer, where the agent bandwidth manager and application bandwidth managers are configured to interact so as to divide network bandwidth available to the computer into individualized application allocations of network bandwidth for each of the applications; where, for an application with multiple sockets, the corresponding application bandwidth manager and socket bandwidth managers are configured to interact so as to sub-allocate the application allocation of network bandwidth into a socket allocation of network bandwidth for each of the sockets.
 13. A system for dynamically managing bandwidth consumption on a network link comprising: an agent computer; a software-based agent module configured to run on the agent computer, the agent module being further configured to: assign a socket priority to a socket opened on the agent computer; provide a socket allocation of network bandwidth to the socket based on the socket priority; and detect whether the socket is being used to perform a predefined network transaction and, after such detection, modify the socket priority for such socket and update the socket allocation of network bandwidth to account for such modification.
 14. The system of claim 13 wherein the agent module is further configured to: assign a socket priority to each of a plurality of sockets as those sockets are opened on the agent computer; provide a socket allocation of network bandwidth for each of the plurality of sockets, where the socket allocation for each of the plurality of sockets depends on the socket priority of such socket relative to the socket priorities of the other sockets; and detect when one of the sockets is being used to perform a predefined network transaction and, after such detection, modify the socket priority for such socket and update the socket allocations of network bandwidth to account for such modification.
 15. The system of claim 14, where the predefined network transaction pertains to only a portion of data being requested by the agent computer from a network resource, such that the socket priority is modified while the socket is being used to obtain such portion of data, and is unmodified while the socket is being used to obtain other portions of the data being requested from the network resource.
 16. The system of claim 13, where the socket priority is restored to an unmodified value after completion of the predefined network transaction.
 17. The system of claim 13, wherein the network link interconnects a plurality of computers, including a control computer and multiple agent computers, where each agent computer is configured to run one or more applications employing one or more sockets, and wherein the system further comprises: plural agent modules, each being adapted to run on one of the agent computers and monitor and repeatedly report a socket status of its associated agent computer; a control module adapted to run on the control computer, where the control module is adapted to dynamically allocate bandwidth on the network link among the agent computers based on relative socket statuses of the agent computers as reported by the agent modules.
 18. A method of dynamically managing bandwidth in a distributed network system with a network link interconnecting a plurality of computers, comprising: assigning a socket priority to a socket, where the socket is associated with an application running on one of the plurality of computers; providing a socket allocation to the socket, where the socket allocation defines network bandwidth usable by the socket; monitoring network transactions performed using the socket to identify whether the socket is being used to perform a predefined network transaction; and dynamically modifying the socket priority after detecting that the socket is being used to perform the predefined network transaction, and updating the socket allocation to account for such modification of the socket priority.
 19. The method of claim 18, further comprising: providing an application allocation to the application, where the application allocation defines network bandwidth usable by the application and is dependent upon the socket allocation; and dynamically modifying the application allocation to account for modification of the socket priority.
 20. The method of claim 18, further comprising: providing a computer allocation to the computer on which the application is running, where the computer allocation defines network bandwidth usable by such computer and is dependent upon the socket allocation; and dynamically modifying the computer allocation to account for modification of the socket priority. 